Kapil Sreedharan

36 Followers

Published in Analytics Vidhya · Feb 7, 2021

Analyze Global Air Pollution Using Apache Spark & BigQuery

How to analyze global air pollution data on the cloud. According to the WHO, 7 million people die every year from exposure to fine particles in polluted air, which lead to diseases such as stroke, heart disease, lung cancer, chronic obstructive pulmonary diseases, and respiratory infections, including pneumonia. 91% of the world’s population live in places where air quality exceeds WHO…

Apache Spark

5 min read

Published in The Startup · Nov 30, 2020

Apache Avro Demystified

Data serialization with Apache Avro. What is Apache Avro? Avro is an open-source, language-agnostic data serialization framework. The schema of Avro files is specified in JSON format, making it easy to read and interpret. Files that store Avro data should always include the schema for that data in the same file. Avro includes APIs for C, C++, C#…

Apache Avro

5 min read

Published in Google Cloud - Community · Oct 12, 2020

Explore & Visualize 200+ Years of Global Temperature

Visualize observable changes in global temperature using NOAA’s historical weather data, Apache Spark, BigQuery, and Data Studio. We have all read about the effects of climate change, and we experience them around us every day. We have seen numbers like: The current global average temperature is 0.85ºC …

Climate Change

6 min read

Published in The Startup · Aug 28, 2020

Build a Hybrid Multi-Cloud Data Lake and Perform Data Processing Using Apache Spark

Create a multi-cloud data lake using Terraform and run a configuration-driven Apache Spark data pipeline on COVID-19 data. Five years ago, when I started working on enterprise big data platforms, the prevalent data lake architecture was to go with a single public cloud provider or on-prem platform. These data lakes quickly grew into several terabytes to petabytes of structured and unstructured data (only 1% of unstructured data is analyzed…

Data Engineering

9 min read

Big Data Consultant | Learn | Build | Share https://github.com/ksree

Following
  • Franco Patano
  • 💡Mike Shakhomirov
  • Ivan Trusov
  • Pınar Ersoy
  • Wang Le
