I am an experienced data engineer with a strong background in building data platforms and services that expose data in various formats. My expertise is in Big Data, and I have a deep understanding of the following tools and technologies:
- Core Hadoop: I am well-versed in Apache Hadoop, which serves as the foundation for distributed processing and storage of large datasets.
- Hive: With Apache Hive, I can create data warehousing solutions and perform efficient data querying using a SQL-like interface.
- Spark: I have extensive experience with Apache Spark, a framework for distributed data processing and analytics.
- HBase: I am proficient in Apache HBase, a distributed NoSQL database, which enables high-speed random access to vast amounts of data.
- Elasticsearch: I have expertise in Elasticsearch, a search and analytics engine that facilitates rapid data indexing and retrieval.
- Ranger and Atlas: I am familiar with Apache Ranger and Apache Atlas, which provide security and metadata management capabilities, respectively.
- HDInsight: I have worked with Microsoft's HDInsight, a cloud-based service that simplifies the deployment and management