Databricks

AML Solutions at Scale Using Databricks Lakehouse Platform

July 16, 2021, 9:23 am

Anti-Money Laundering (AML) compliance has been undoubtedly one of the top agenda items for regulators providing oversight of financial institutions across the globe. As AML evolved and became more...

View Article

Unlocking The Power of Health Data With a Modern Data Lakehouse

July 19, 2021, 8:54 am

A single patient produces approximately 80 megabytes of medical data every year. Multiply that across thousands of patients over their lifetime, and you’re looking at petabytes of patient data that...

View Article

How Databricks’ Data Team Built a Lakehouse Across 3 Clouds and 50+ Regions

July 14, 2021, 9:00 am

The internal logging infrastructure at Databricks has evolved over the years and we have learned a few lessons along the way about how to maintain a highly available log pipeline across multiple clouds...

View Article

The Three Things CXO’s Prioritize in Their Data and AI Strategy

July 20, 2021, 10:00 am

Leveraging data (internal and external) and customer analytics to innovate and create competitive advantages is more powerful than it has ever been. This popular practice is fueled by the growing...

View Article

Top Considerations When Migrating Off of Hadoop

July 22, 2021, 8:00 am

Apache Hadoop was created more than 15 years ago as an open source, distributed storage and compute platform designed for large data sets and large-scale batch processing. Early on, it was cheaper than...

View Article

Improving Patient Insights With Textual ETL in the Lakehouse Paradigm

July 22, 2021, 9:00 am

This is a collaborative post from Databricks and Forest Rim Technology. We thank Bill Inmon, Founder and CEO, and Mary Levins, Chief Data Officer, of Forest Rim for their contributions. The amount...

View Article

Monitoring ML Models With Model Assertions

July 22, 2021, 11:08 am

This is a collaborative post from Databricks and the Stanford University Computer Science Department. We thank Daniel Kang, Deepti Raghavan and Peter Bailis of Stanford University for their...

View Article

The Delta Between ML Today and Efficient ML Tomorrow

July 22, 2021, 12:02 pm

Delta Lake and MLflow both come up frequently in conversation but often as two entirely separate products. This blog will focus on the synergies between Delta Lake and MLflow for machine learning use...

View Article

Getting Started With Ingestion into Delta Lake

July 23, 2021, 8:00 am

Ingesting data can be hard and complex since you either need to use an always-running streaming platform like Kafka or you need to be able to keep track of which files haven’t been ingested yet. In...

View Article

Augment Your SIEM for Cybersecurity at Cloud Scale

July 23, 2021, 9:31 am

Over the last decade, security information and event management tools (SIEMs) have become a standard in enterprise security operations. SIEMs have always had their detractors. But the explosion of cloud...

View Article

Databricks Lecture Series at UC Berkeley School of Information

July 29, 2021, 8:00 am

This is a collaborative post from Databricks and UC Berkeley. We thank Tia Foss, Director of Philanthropy, UC Berkeley School of Information, for her contributions. Databricks began in the computer...

View Article

An Experimentation Pipeline for Extracting Topics From Text Data Using PySpark

July 29, 2021, 10:00 am

This post is part of a series of posts on topic modeling. Topic modeling is the process of extracting topics from a set of text documents. This is useful for understanding or summarizing large...

View Article

How We Built Databricks on Google Kubernetes Engine (GKE)

August 6, 2021, 8:59 am

Our release of Databricks on Google Cloud Platform (GCP) was a major milestone toward a unified data, analytics and AI platform that is truly multi-cloud. Databricks on GCP, a jointly-developed service...

View Article

5 Key Steps to Successfully Migrate From Hadoop to the Lakehouse Architecture

August 6, 2021, 11:22 am

The decision to migrate from Hadoop to a modern cloud-based architecture like the lakehouse architecture is a business decision, not a technology decision. In a previous blog, we dug into the reasons...

View Article

Introducing Support for gp3, Amazon’s New General Purpose SSD Volume

August 10, 2021, 7:56 am

Databricks clusters on AWS now support gp3 volumes, the latest generation of Amazon Elastic Block Store (EBS) general purpose SSDs. gp3 volumes offer consistent performance, cost savings and the...

View Article

How We Achieved High-bandwidth Connectivity With BI Tools

August 11, 2021, 8:30 am

Business Intelligence (BI) tools such as Tableau and Microsoft Power BI are notoriously slow at extracting large query results from traditional data warehouses because they typically fetch the data in...

View Article

Announcing the Databricks Beacons Program

August 12, 2021, 12:17 pm

With roots in academia and open source, we know much of Databricks’ success is due to the community: the data scientists, data engineers, developers, data architects, data analysts, open-source...

View Article

How Building Apache Zeppelin Led Me to Databricks

August 12, 2021, 1:00 pm

Today, I am excited to announce that I have officially joined Databricks as an Engineer on the Data Science team. This move comes after over a year of founding and running Staroid, a cloud-based...

View Article

Getting to Know Databricks India

August 18, 2021, 8:00 am

India is a vast country with extreme variations. A one-size-fits-all workplace does not do it justice. Although continued urbanization, transportation, and infrastructure are the foundation for...

View Article

Mastering the Next Level: Leveraging Data and AI in the Gaming Sector

August 18, 2021, 10:36 am

How do you take 10k events per second from 30M users to create a better gamer experience? How can a small data team build more automated workflows to grow impact across all business units, from finance...

View Article