Top Three Data Sharing Use Cases With Delta Sharing
Data sharing has become an essential component to drive business value as companies of all sizes look to securely exchange data with their customers, suppliers and partners. According to a recent...
Improving Drug Safety With Adverse Event Detection Using NLP
The World Health Organization defines pharmacovigilance as “the science and activities relating to the detection, assessment, understanding and prevention of adverse effects or any other...
Understanding New Years Trends: A Simple, Unified Pipeline on the Databricks...
For many people, the start of a new year marks the perfect time to make a change. That’s why, despite their rather polarizing nature, New Year’s resolutions remain an important tradition for kickstarting...
Hunting Anomalous Connections and Infrastructure With TLS Certificates
According to Sophos, 46% of all malware now uses Transport Layer Security (TLS) to conceal its communication channels, a number that has doubled in the last year alone. Malware, such as LockBit...
Top Four Characteristics of Successful Data and AI-driven Companies
At Databricks, we have had the opportunity to help thousands of organizations modernize their data architectures to be cloud-first and extract value from their data at scale with analytics and AI. Over...
My First 90 Days as a Databricks Engineering Leader
I recently joined Databricks as Site Lead for the Seattle office and lead engineering for the Partner Platform team. This career move builds off of my 18+ years in the data and platforms space. I...
Delta Sharing Release 0.3.0
We are excited for the release of Delta Sharing 0.3.0, which introduces several key improvements and bug fixes, including the following features: Delta Sharing is now available for Azure Blob Storage...
Beyond LDA: State-of-the-art Topic Models With BigARTM
This post follows up on the series of posts in Topic Modeling for text analytics. Previously, we looked at the LDA (Latent Dirichlet Allocation) topic modeling library available within...
Why We Invested in Labelbox: Streamline Unstructured Data Workflows in a...
Last month, Databricks announced the creation of Databricks Ventures, a strategic investment vehicle to foster the next generation of innovation and technology harnessing the power of data and AI. We...
Taming JavaScript Exceptions With Databricks
This post is part of our blog series on our frontend work. You can see the previous ones, “Simplifying Data + AI, One Line of TypeScript at a Time” and “Building the Next Generation Visualization...
Hunters and Databricks Ventures Partner for Advanced Security on the Lakehouse
Modern security teams must quickly detect, investigate and respond to threats to minimize their impact and better mitigate the risk to the organization. With the growth of modern IT infrastructure,...
Building Data Applications on the Lakehouse With the Databricks SQL Connector...
We are excited to announce General Availability of the Databricks SQL Connector for Python. This follows the recent General Availability of Databricks SQL on Amazon Web Services and Azure. Python...
Creating a Faster TAR Extractor
Tarballs are used industry-wide for packaging and distributing files, and this is no different at Databricks. Every day we launch millions of VMs across multiple cloud providers. One of the first steps...
Investing in TickSmith: Enabling an E-Commerce Data Experience With Open Data...
We are excited to announce Databricks Ventures’ investment in TickSmith, a leading SaaS platform that simplifies the online data shopping experience. The investment through the Lakehouse Fund, created...
Orchestrating Databricks Workloads on AWS With Managed Workflows for Apache...
In this blog, we explore how to leverage Databricks’ powerful jobs API with Amazon Managed Workflows for Apache Airflow (MWAA) and integrate with CloudWatch to monitor Directed Acyclic Graphs (DAGs) with...
The Ubiquity of Delta Standalone: Java, Scala, Hive, Presto, Trino, Power BI,...
We are excited for the release of Delta Connectors 0.3.0, which introduces support for writing Delta tables. The key features in this release are: Delta Standalone Write functionality – This release...
Make Your Data Lakehouse Run, Faster With Delta Lake 1.1
Delta Lake 1.1 improves performance for merge operations, adds support for generated columns and improves nested field resolution. With the tremendous contributions from the open-source community,...
Streamline MLOps With MLflow Model Registry Webhooks
As machine learning becomes more widely adopted, businesses need to deploy models at speed and scale to achieve maximum value. Today, we are announcing MLflow Model Registry Webhooks, making it easier...
Scaling SHAP Calculations With PySpark and Pandas UDF
With the proliferation of applications of Machine Learning (ML) and especially Deep Learning (DL) models in decision making, it is becoming more crucial to see through the black box and...
Google Datastream Integration With Delta Lake for Change Data Capture
This is a collaborative post between the data teams at Badal, Google and Databricks. We thank Eugene Miretsky, Partner, and Steven Deutscher-Kobayashi, Senior Data Engineer, of Badal, and Etai...