Databricks

Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time...

September 16, 2015, 10:00 am

To try the new features highlighted in this blog post, download Spark 1.5 or sign up for a free trial of Databricks. A few days ago, we announced the release of Spark 1.5. This release contains major...

View Article

Large Scale Topic Modeling: Improvements to LDA on Spark

September 22, 2015, 9:32 am

This blog was written by Feynman Liang and Joseph Bradley from Databricks, and Yuhao Yang from Intel. To get started using LDA, download Spark 1.5 or sign up for a free trial of Databricks. What are...

View Article

Easier Spark Code Debugging: Real-Time Progress Bar and Spark UI Integration...

September 23, 2015, 8:58 am

To try the features mentioned in this blog, sign up for a free trial of Databricks. We are excited to introduce the integration of Spark UI in Databricks notebooks, which allows the user to understand...

View Article

Spark Survey Results 2015 are now available

September 24, 2015, 8:45 am

We ran the Spark Survey 2015 this summer to gain insights into how organizations are using Apache Spark. Press Release: "Apache Spark Outgrowing Hadoop as Users Increasingly Move to the Cloud." The results...

View Article

Improved Frequent Pattern Mining in Spark 1.5: Association Rules and...

September 28, 2015, 7:25 am

We would like to thank Jiajin Zhang and Dandan Tu from Huawei for contributing to this blog. To get started mining patterns from massive datasets, download Spark 1.5 or sign up for a free trial of...

View Article

Spark 1.5.1 and what do version numbers mean?

October 1, 2015, 10:14 pm

The inaugural Spark Summit Europe will be held in Amsterdam on October 27 – 29. Check out the full agenda and get your ticket before it sells out! We are excited to announce the availability of Apache...

View Article

Generalized Linear Models in SparkR and R Formula Support in MLlib

October 5, 2015, 8:26 am

To get started with SparkR, download Spark 1.5 or sign up for a free trial of Databricks. Spark 1.5 adds initial support for distributed machine learning over SparkR DataFrames. To provide an intuitive...

View Article

Interactive Audience Analytics With Spark and HyperLogLog

October 13, 2015, 8:00 am

This is a guest blog from Eugene Zhulenev on his experiences with Engineering Machine Learning and Audience Modeling at The Collective. At Collective, we are working not only on cool things like...

View Article

Call for Presentations for the 2016 Spark Summit East is Now Open

October 14, 2015, 8:15 am

We are excited to announce that the call for presentations for the second Spark Summit East is now open. Please join us in New York City on February 16-18, 2016 to share your experience with Apache...

View Article

Introducing the spark-redshift package

October 19, 2015, 10:24 am

This is a guest blog from Sameer Wadkar, Big Data Architect/Data Scientist at Axiomine. The Spark Data Source API was introduced in Spark 1.2 to provide a pluggable mechanism for integration with...

View Article

Audience Modeling With Spark ML Pipelines

October 20, 2015, 6:00 am

This is a guest blog from Eugene Zhulenev on his experiences with Engineering Machine Learning and Audience Modeling at The Collective. At Collective, we heavily rely on machine learning and...

View Article

Introducing More Databricks Notebooks Sharing Options

October 21, 2015, 9:53 am

To try the new export features mentioned in this blog, sign up for a free trial of Databricks. Databricks notebooks function as a development environment for developers, data engineers, and data...

View Article

See You at Spark Summit EU

October 26, 2015, 2:16 am

With the first Spark Summit Europe only days away, we are excited to announce that all 900 tickets have been sold out. Spark Summit is the premier event that brings the Apache Spark community together,...

View Article

Visualizing Machine Learning Models

October 27, 2015, 6:03 am

To try the new visualization features mentioned in this blog, sign up for a free trial of Databricks. You’ve built your machine learning models and evaluated them with error metrics, but do the numbers...

View Article

It’s a Wrap! A Lookback at Spark Summit in Amsterdam

October 29, 2015, 5:24 pm

The Databricks team would like to thank the Apache Spark community and our partners for making the inaugural Spark Summit EU 2015 a great success! The Summit was a sold-out event with close to 930...

View Article

Announcing the Spark TFOCS Optimization Package

November 2, 2015, 10:25 am

Aaron is the developer of Spark TFOCS, with support from Databricks. Aaron is a freelance software developer with experience in data infrastructure and analytics. Click-through prediction, shipping...

View Article

How Yesware is Using Databricks to Transition from Concept to Product

November 5, 2015, 8:25 am

This is a guest blog from Justin Mills, Data Team Lead at Yesware. To try out Databricks for your next Spark application, sign up for a free trial. Yesware provides salespeople with data-driven...

View Article

It’s a Wrap! A Lookback at Spark Summit in Amsterdam

November 6, 2015, 9:00 am

The Databricks team would like to thank the Apache Spark community and our partners for making the inaugural Spark Summit EU 2015 a great success! The Summit was a sold-out event with close to 930...

View Article

Succinct Spark from AMPLab: Queries on Compressed RDDs

November 10, 2015, 9:54 am

This is a guest post from Rachit Agarwal and Anurag Khandelwal of the UC Berkeley AMPLab, leads of an ongoing research project called Succinct. Succinct is a distributed data store that supports a wide...

View Article

Elsevier Spark Use Cases with Databricks and Contribution to Spark Packages

November 11, 2015, 7:43 am

This is a guest blog from Darin McBeath, Disruptive Technology Director at Elsevier. To try out Databricks for your next Spark application, sign up for a free trial. Elsevier is a provider of...

View Article