govhost.blogg.se - Cloudera hadoop distribution vmware on mac

#Cloudera hadoop distribution vmware on mac update#

* Wrote MapReduce jobs to parse web logs stored in HDFS.
* Responsible for Spark Streaming configuration based on the type of input source.
* Responsible for the design and development of Spark SQL scripts based on functional specifications.

Infosys Ltd Spark developer | Hillsboro | January 2016 - Current

* Experience in developing service components using JDBC.

Authorized to work in the US for any employer

* Strong problem-solving and analytical skills, and the ability to make balanced and independent decisions.
* Experienced in both Waterfall and Agile (Scrum) development methodologies.
* Experience working with build tools like Maven and Ant.
* Experience in database design using PL/SQL to write stored procedures, functions, and triggers, and strong experience in writing complex queries for Oracle.
* Used curl scripts to test RESTful web services.


* Experience in creating reusable transformations (Joiner, Sorter, Aggregator, Expression, Lookup, Router, Filter, Update Strategy, Sequence Generator, Normalizer and Rank) and mappings using Informatica Designer, and in processing tasks using Workflow Manager to move data from multiple sources into targets.
* Excellent Java development skills using J2EE, J2SE, Servlets, JSP, Spring, Hibernate, and JDBC.
* Involved in unit testing of MapReduce programs using Apache MRUnit.
* Extensive knowledge of using SQL queries for backend database analysis.
* Worked with BI tools like Tableau for report creation and further analysis from the front end. Also have knowledge of Pentaho and Informatica as other ETL tools that work with Big Data.

* Worked with ETL tools like Talend to simplify MapReduce jobs from the front end.
* Worked with Big Data distributions like Cloudera (CDH 3 and 4) with Cloudera Manager.

* Extended Hive and Pig core functionality by using custom User Defined Functions (UDF), User Defined Table-Generating Functions (UDTF) and User Defined Aggregating Functions (UDAF).
* Good working experience using Sqoop to import data into HDFS or Hive from an RDBMS and to export data from HDFS or Hive back to an RDBMS.
* Very good experience with partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
* Worked on HBase to load and retrieve data for real-time processing using the REST API.
* Expertise in database design, creation and management of schemas, and writing stored procedures, functions, and DDL and DML SQL queries.
* Good knowledge of NoSQL databases Cassandra, MongoDB and HBase.
* Strong experience with Hadoop distributions like Cloudera and Hortonworks.
* Working experience with the Cloudera Data Platform using VMware Player in a CentOS 6 Linux environment.
* Experienced in writing complex MapReduce programs that work with different file formats like Text, Sequence, XML, JSON and Avro.
* Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.

* Good experience in Big Data technologies like Hadoop frameworks, MapReduce, Hive, HBase, Pig, Sqoop, Spark, Kafka, Flume, ZooKeeper, Oozie, and Storm.

Hadoop/Spark Developer. 8+ years of overall IT experience in a variety of industries, which includes hands-on experience on Big