Data source connectivity | API integration | Formatting and cleaning | Source code
If you want to build or customize your data pipeline, this is the perfect gig for you. As an experienced data engineer who has worked in this field for many years, I can build complete data pipelines involving ETL (Extract, Transform, Load) operations using Python, with integration into cloud services.
Here is what I will do in this process:
1. Extraction/Connection:
- From any type of website, including e-commerce and commercial sites
- From sites that require a login
- From websites with hidden APIs
- Developing custom scrapers (see the sketch after this list)
- Integrating scrapers with databases
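
To give you an idea of what a custom scraper can look like, here is a minimal Python sketch using requests and BeautifulSoup. The URL and CSS selectors (.product-card, .product-name, .product-price) are placeholders, not from any real site; a real scraper is always tailored to the target site's structure.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL -- a real scraper targets the client's chosen site.
URL = "https://example.com/products"

def scrape_products(url: str) -> list[dict]:
    """Fetch a page and extract name/price pairs from product cards."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    items = []
    # The CSS selectors below are assumptions for illustration only.
    for card in soup.select(".product-card"):
        items.append({
            "name": card.select_one(".product-name").get_text(strip=True),
            "price": card.select_one(".product-price").get_text(strip=True),
        })
    return items

if __name__ == "__main__":
    for product in scrape_products(URL):
        print(product)
```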
2. Transformation:
- Merging
- Appending
- Summarizing
- Filtering
- Enriching
- Splitting
- Joining
- Deduplicating, and many more... (a short sketch of these operations follows this list)
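
To illustrate a few of these operations, here is a short pandas sketch covering deduplication, joining/enriching, filtering, and summarizing. The frames, column names, and the 20.0 threshold are made-up examples, not from any specific pipeline.

```python
import pandas as pd

# Hypothetical input frames -- in a real pipeline these come from
# the extraction step above.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "customer_id": [10, 11, 11, 12],
    "amount": [25.0, 40.0, 40.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "country": ["US", "DE", "US"],
})

# Deduplicate: drop the repeated order row.
orders = orders.drop_duplicates(subset="order_id")

# Join/enrich: attach customer attributes to each order.
merged = orders.merge(customers, on="customer_id", how="left")

# Filter: keep only orders at or above a threshold.
merged = merged[merged["amount"] >= 20.0]

# Summarize: total revenue per country.
summary = merged.groupby("country", as_index=False)["amount"].sum()
print(summary)
```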
3. Loading, to local or cloud storage (see the sketch after this list):
- Locally: into any type of database (MySQL, MongoDB, PostgreSQL, MariaDB, etc.) or into any flat file format (JSON, CSV, TSV, etc.)
- In the cloud: AWS S3, Google Cloud Storage, Azure Blob Storage, etc. (any cloud service you want)
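
As a rough sketch of both load targets, the snippet below writes a DataFrame to a PostgreSQL table and a CSV file, then uploads the file to S3. The connection string, table name, and bucket name are placeholders, and AWS credentials are assumed to already be configured in the environment.

```python
import boto3
import pandas as pd
from sqlalchemy import create_engine

df = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})  # placeholder data

# Local load: write to a Postgres table (connection string is a placeholder).
engine = create_engine("postgresql://user:password@localhost:5432/mydb")
df.to_sql("my_table", engine, if_exists="replace", index=False)

# Flat file load: write the same data to CSV.
df.to_csv("output.csv", index=False)

# Cloud load: upload the CSV to an S3 bucket (bucket name is a placeholder;
# credentials are read from the environment or ~/.aws/credentials).
s3 = boto3.client("s3")
s3.upload_file("output.csv", "my-example-bucket", "exports/output.csv")
```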