Azure Databricks is ranked #29 in the Data Warehouse Tools product directory based on the latest available data collected by SelectHub. Compare the leaders with our In-Depth Report.

Azure Databricks Benefits and Insights

Why use Azure Databricks?

Key differentiators & advantages of Azure Databricks

  • Data Integration: Pulls from a multitude of data sources, such as CSV and JSON files, SQL servers, Snowflake, SQL endpoints, MySQL, PostgreSQL, Oracle and many more. Its SQL Analytics component connects to Azure data sources, including Data Lake Storage, Cosmos DB, Synapse Analytics and Blob Storage. Unifies batch and streaming data processing at scale and ensures data reliability through optimized transactions.
  • Unified Architecture: Save costs with clusters that scale up automatically as the workload increases and scale down during lighter workloads. Compute faster through secure, scalable and performance-optimized Spark clusters by pulling data through Delta Lake, an open-source storage layer on top of Databricks’ data lake.
  • Self-Service Data Processing: Easily create clusters — all-purpose or job-specific — through its UI for autonomous data processing. Set up clusters per preset configurations, with total administrative control over usage access and performance monitoring through logs. 
  • Data Analysis: Its Data Science Workspaces enable writing commands in R, Python, Scala and SQL to find insights quicker. Create visualizations through interactive point-and-click action, or leverage matplotlib, ggplot and D3 to visualize and analyze data. Collaborate on notebooks in real-time and track changes through versioning. 
  • ML-Based Modeling: Access ready-to-use, scalable and optimized machine learning environments with one click at any phase of the model life cycle. Move from prototyping to production on a single platform for data prep and model building.
  • Integration with Microsoft Suite: Authenticate via existing Azure Active Directory credentials and SAML 2.0. Integrates easily with Microsoft products, including Data Lake, Data Warehouse, Blob Storage and Event Hub. Keeps data secure according to FedRAMP standards.
  • Data Governance: Ensure governance compliance with granular role-based access control, versioning and data lineage tracking. Control access to dashboards, endpoints, queries and alerts through access control lists (ACLs). Use personal access tokens to authenticate to its REST APIs and BI tools.
  • Collaboration: Build data pipelines with any language of choice, such as Python, Scala, R and SQL. Collaborate on an open platform for data querying, data model creation and machine learning. Run analytics and troubleshoot issues by tracking usage through version control via GitHub and Azure DevOps.
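As an illustration of the personal access token authentication mentioned under Data Governance, the sketch below builds (but does not send) a REST request that carries the token as a bearer credential. The workspace URL and token are placeholders, and the helper name build_clusters_list_request is hypothetical; treat this as a minimal sketch under those assumptions rather than the vendor's recommended client.

```python
import urllib.request


def build_clusters_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build, without sending, a GET request to the clusters list endpoint.

    The helper name is hypothetical; the token travels in a standard
    'Authorization: Bearer <token>' header.
    """
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder workspace URL and token, for illustration only.
request = build_clusters_list_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "dapi-EXAMPLE-TOKEN",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) requires a real workspace and a valid token, so that step is left out here.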

Industry Expertise

Azure Databricks provides data analytics capabilities to clients in diverse industries across the globe, including telecommunications, oil and gas, packaged goods, securities, gaming, freight, learning and enablement, information and analytics consultancies, and biotech.

Azure Databricks Reviews

Average customer reviews & user sentiment summary for Azure Databricks:

User satisfaction level: great

197 reviews

88% of users would recommend this product

Key Features

  • ML-Based Analytics: Serves as a big data computing platform for Azure Machine Learning by processing data in parallel. Process big data efficiently by creating Azure ML experiments through the Python SDK. Attach notebooks or PySpark scripts to pipelines within Azure ML for specific machine learning tasks.
  • SQL Analytics: Share insights with others through easy-to-understand dashboards that combine visuals with text. Run ad-hoc queries on SQL endpoints in its data lake to glean actionable insights. Secure data with role-based access and enterprise-grade SLAs. 
  • Dashboards: Create dynamic reports from interactive dashboards and visualization types that include bar, pie, line, map, area, histogram, scatter, pivot and legacy charts, with built-in toolkits. Display visualizations of ML-based sample data with training parameters and results, such as ROCs, residuals and decision trees. 
  • Alerts: Monitor business metrics by setting alerts and integrating them with other workflows, such as raising a support request. Get notified when a field value returned by a scheduled query meets a threshold.
  • Workspace with Apache Spark: The Apache Spark ecosystem includes SQL for querying structured data, streaming analytics for real-time data processing, machine learning libraries and graph computation for analytics. 
  • Data Ingestion: Ingest raw data from any source, schedule data transformations, version tables and keep data ready for analysis. Auto Loader is a cloud source for Apache Spark that incrementally loads new raw data from cloud storage as it arrives, with low cost and latency.
  • Data Preparation: Prepare and cleanse data derived from multiple sources, while maintaining its integrity and reliability, through separate data prep tables built on Delta Lake. Ensures a continuous flow of up-to-date data through the data lake and on to stakeholders for current business insight.
  • Databricks Connect: Connects a preferred IDE (e.g., IntelliJ, Eclipse, PyCharm, RStudio or Visual Studio), a notebook server (e.g., Zeppelin or Jupyter) and other applications to its clusters.
  • Notebooks: Create data-intensive applications through notebooks that support multiple languages like Python, R, Scala and SQL. Use Jupyter Notebooks for machine learning and big data analytics and deploy instantly to production by tweaking parameters such as data sources and output directories. 
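The threshold alerts described under Alerts can be pictured with a small sketch: a condition is checked against a field of a scheduled query's result row, and a notification fires when it holds. The Alert class, its field names and the sample row below are all hypothetical, written in plain Python for illustration; this is not the product's own alerting API.

```python
import operator
from dataclasses import dataclass

# Comparison operators an alert condition may use (an assumed set).
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


@dataclass
class Alert:
    """Hypothetical alert: fires when row[column] <op> threshold is true."""

    column: str
    op: str
    threshold: float

    def is_triggered(self, row: dict) -> bool:
        compare = _OPERATORS[self.op]
        return compare(row[self.column], self.threshold)


# A made-up result row from a scheduled query.
latest_row = {"open_tickets": 120}
alert = Alert(column="open_tickets", op=">", threshold=100)

if alert.is_triggered(latest_row):
    # A real workflow might raise a support request here instead.
    print("Alert triggered: open_tickets exceeded 100")
```

Keeping the condition as data (column, operator, threshold) rather than as code is one way to let non-developers define alerts through a UI.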

Limitations

At the time of this review, these are the limitations according to user feedback:

  • Cluster, job and workspace access control is available only in the Premium Plan
  • Code that is not part of a Spark job cannot be run on a remote cluster
  • Azure Active Directory credential passthrough is not supported on non-Standard clusters running Databricks Runtime 7.3 LTS and below
  • Support is not available for Databricks Jobs Light Compute workloads

Suite Support

Before contacting support, consult the documentation, knowledge base, training guides, tutorials, best practices and user forums on the vendor’s website for self-paced resolution of issues and queries. The vendor offers three support options: Business, Enhanced and Production. Subscribers to the Business plan and above have access to shorter support response times, a Slack channel and Spark technical experts. Subscribe to the Production plan to get 24/7 support access throughout the year.

Support business hours are from 9 a.m. to 6 p.m. Monday through Friday, excluding U.S. holidays, in the local North American, Central European or Singapore/China time zones.

Email: Not available. Submit a form on the Contact Us page on the vendor’s website.
Phone: Call 1-866-330-0121.
Training: Attend instructor-led training courses (live and virtual) or access self-paced learning modules on the vendor’s website. Get certified as an SQL analyst, a data engineer or a data scientist with the Databricks Academy.
Tickets: Create a support ticket in the Help Center after signing in to a user account.