That’s a Wrap! Spark Live Draws Huge Audience in Los Angeles.
As we continue our road show across the United States, one observation has held true at our first two stops: there is an unquestionable thirst for Apache Spark knowledge. Spark...
Apache Spark 2.0 Preview: Machine Learning Model Persistence
Spark Summit 2016 will be held in San Francisco on June 6–8. Check out the full agenda and get your ticket before it sells out! Try the ML Pipeline Persistence Notebook on Databricks Community Edition...
Preview Apache Spark 2.0: An Anthology of Technical Assets
Older anthologies collated contributions from various authors around a theme, bound then as a journal or periodical. Newer anthologies include multiple modes of expression, digitized...
Databricks to Launch First of Five Free Big Data Courses on Apache Spark
Databricks helps hundreds of organizations use Apache Spark to answer important questions by analyzing data. Apache Spark is an open-source data processing engine for engineers and analysts that...
Not Your Father’s Database: How to Use Apache Spark Properly in your Big Data...
Two months ago, we held a live webinar — Not Your Father’s Database: How to Use Apache Spark Properly in your Big Data Architecture — which covered a series of use cases where you can store your data...
Databricks Community Edition is now Generally Available
We are excited to announce the General Availability (GA) of Databricks Community Edition (DCE). As a free version of the Databricks service, DCE enables everyone to learn and explore Apache Spark, by...
Achieving End-to-end Security for Apache Spark with Databricks
Today we are excited to announce the completion of the first phase of the Databricks Enterprise Security (DBES) framework. We are proud to say that this makes Databricks the first and only company to...
Another Record-Setting Spark Summit
The lure of San Francisco is indisputable, as is its position as the preeminent high-tech hub. Day one of Spark Summit 2016, the largest community event dedicated to Apache Spark, drew more than...
An Introduction to Writing Apache Spark Applications on Databricks
Try this Notebook in Databricks Community Edition. This is part 1 of a 3-part series providing a gentle introduction to writing Apache Spark applications on Databricks. When I first started learning...
SQL Subqueries in Apache Spark 2.0
Try the notebook in Databricks. In the upcoming Apache Spark 2.0 release, we have substantially expanded the SQL standard capabilities. In this brief blog post, we will introduce subqueries in Apache...
Access Control for Databricks Jobs
Secure your production workloads end-to-end with Databricks’ comprehensive access control system. Databricks offers role-based access control for clusters and workspace to secure infrastructure and user...
Spark Summit EU 2017 Recap and Reflections
“Dublin is now a truly cosmopolitan capital, with an influx of people, energy, and ideas infusing the ever-beguiling, multi-layered city with fresh flavors and kaleidoscopic colors,” writes the Lonely...
What AWS Per-Second Billing Means for Big Data Processing
Databricks, the Unified Analytics Platform, has always been a cloud-first platform. We believe in the scalability and elasticity of the cloud so that customers can easily run their large production...
Introducing Command Line Interface for Databricks Developers
As part of the Unified Analytics Platform, Databricks Workspace and Databricks File System (DBFS) are critical components that facilitate collaboration among data scientists and data...
Introducing Azure Databricks
99% of organizations still struggle to get valuable analytics from their Big Data and achieve the full potential of AI. Today I’m excited to announce a new partnership with Microsoft that represents a...
A Technical Overview of Azure Databricks
This is a joint blog post from Matei Zaharia, Chief Technologist at Databricks, and Peter Carlin, Distinguished Engineer at Microsoft. Today at Microsoft Connect(); we introduced Azure Databricks, an...
Cloud-based Relational Database Management Systems at Databricks
Databricks and Microsoft have jointly developed a new cloud service called Microsoft Azure Databricks, which makes Apache Spark analytics fast, easy, and collaborative on the Azure cloud. Not only does...
Women in Big Data, Apache Spark and AI: Bay Area Spark Meetup at Databricks...
When Fei-Fei Li, the director of Stanford’s AI Lab and now a chief scientist at Google Cloud, was asked in an interview in the MIT Technology Review: The Artificial Intelligence Issue why she advocated...
Databricks Achieves AWS Machine Learning Competency Status
Today we announced that Amazon has awarded Databricks the Amazon Web Services (AWS) Machine Learning (ML) Competency status. This designation recognizes Databricks for enabling data scientists and...
Transparent Autoscaling of Instance Storage
Big data workloads require access to disk space for a variety of operations, generally when intermediate results will not fit in memory. When the required disk space is not available, the jobs fail. To...