Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time...
To try new features highlighted in this blog post, download Spark 1.5 or sign up for a free trial of Databricks. A few days ago, we announced the release of Spark 1.5. This release contains major...
Large Scale Topic Modeling: Improvements to LDA on Spark
This blog was written by Feynman Liang and Joseph Bradley from Databricks, and Yuhao Yang from Intel. To get started using LDA, download Spark 1.5 or sign up for a free trial of Databricks. What are...
Easier Spark Code Debugging: Real-Time Progress Bar and Spark UI Integration...
To try the features mentioned in this blog, sign up for a free trial of Databricks. We are excited to introduce the integration of Spark UI in Databricks notebooks, which allows the user to understand...
Spark Survey Results 2015 are now available
We ran the Spark Survey 2015 this summer to gain insights on how organizations are using Apache Spark. Press Release: Apache Spark Outgrowing Hadoop as Users Increasingly Move to the Cloud The results...
Improved Frequent Pattern Mining in Spark 1.5: Association Rules and...
We would like to thank Jiajin Zhang and Dandan Tu from Huawei for contributing to this blog. To get started mining patterns from massive datasets, download Spark 1.5 or sign up for a free trial of...
Spark 1.5.1 and what do version numbers mean?
The inaugural Spark Summit Europe will be held in Amsterdam on October 27 – 29. Check out the full agenda and get your ticket before it sells out! We are excited to announce the availability of Apache...
Generalized Linear Models in SparkR and R Formula Support in MLlib
To get started with SparkR, download Spark 1.5 or sign up for a free trial of Databricks. Spark 1.5 adds initial support for distributed machine learning over SparkR DataFrames. To provide an intuitive...
Interactive Audience Analytics With Spark and HyperLogLog
This is a guest blog from Eugene Zhulenev on his experiences with Engineering Machine Learning and Audience Modeling at The Collective. At Collective, we are working not only on cool things like...
Call for Presentations for the 2016 Spark Summit East is Now Open
We are excited to announce that the call for presentations for the second Spark Summit East is now open. Please join us in New York City on February 16-18, 2016 to share your experience with Apache...
Introducing the spark-redshift package
This is a guest blog from Sameer Wadkar, Big Data Architect/Data Scientist at Axiomine. The Spark Data Source API was introduced in Spark 1.2 to provide a pluggable mechanism for integration with...
Audience Modeling With Spark ML Pipelines
This is a guest blog from Eugene Zhulenev on his experiences with Engineering Machine Learning and Audience Modeling at The Collective. At Collective, we heavily rely on machine learning and...
Introducing More Databricks Notebooks Sharing Options
To try the new export features mentioned in this blog, sign up for a free trial of Databricks. Databricks notebooks function as a development environment for developers, data engineers, and data...
See You at Spark Summit EU
With the first Spark Summit Europe only days away, we are excited to announce that all 900 tickets have sold out. Spark Summit is the premier event that brings the Apache Spark community together,...
Visualizing Machine Learning Models
To try the new visualization features mentioned in this blog, sign up for a free trial of Databricks. You’ve built your machine learning models and evaluated them with error metrics, but do the numbers...
It’s a Wrap! A Lookback at Spark Summit in Amsterdam
The Databricks team would like to thank the Apache Spark community and our partners for making the inaugural Spark Summit EU 2015 a great success! The Summit was a sold-out event with close to 930...
Announcing the Spark TFOCS Optimization Package
Aaron is the developer of Spark TFOCS, with support from Databricks. Aaron is a freelance software developer with experience in data infrastructure and analytics. Click-through prediction, shipping...
How Yesware is Using Databricks to Transition from Concept to Product
This is a guest blog from Justin Mills, Data Team Lead at Yesware. To try out Databricks for your next Spark application, sign up for a free trial. Yesware provides salespeople with data-driven...
Succinct Spark from AMPLab: Queries on Compressed RDDs
This is a guest post from Rachit Agarwal and Anurag Khandelwal of the UC Berkeley AMPLab, leads of an ongoing research project called Succinct. Succinct is a distributed data store that supports a wide...
Elsevier Spark Use Cases with Databricks and Contribution to Spark Packages
This is a guest blog from Darin McBeath, Disruptive Technology Director at Elsevier. To try out Databricks for your next Spark application, sign up for a free trial. Elsevier is a provider of...