Job Description :
- Should have worked on Data Sourcing & Data Privacy for Generative AI projects
- Strong Hands-on skills for Data pipeline orchestration
- Knowledge of Master Data Management concepts and techniques including modelling, data loads, Data lineage, metadata and data usage
- Experience with one or more EDM toolsets such as OpenText, IBM FileNet, Microsoft SharePoint, and Oracle WebCenter Suite
- Develop best practices, standards, and methodologies to assist in the implementation and execution of Data Governance
- Design, develop, and research Machine Learning systems, models, and schemes through Data Science
- Administer the processes for receiving, documenting, tracking and investigating all complaints regarding alleged breaches of Privacy Policies and Practices
- Identify Privacy and Data Protection-related risks and drive mitigation efforts throughout the organization
- Should be able to define and implement Data Protection & Privacy strategies that protect consumer and employee data
- Familiarity with Deep Learning, Machine Learning and NLP / NLG frameworks (like Keras, TensorFlow or PyTorch etc.), HuggingFace Transformers and libraries (like scikit-learn, spaCy, Gensim, CoreNLP etc.)
- Should have experience in AWS services such as SageMaker, Elasticsearch, and general knowledge of AWS architecture & other services
- Solid knowledge and understanding of supervised, unsupervised and reinforcement learning algorithms
- Understanding of the current state of AI / ML, Large Language Models, and Generative AI techniques.
- Identify data, verify and validate data quality, prepare datasets, develop predictive models and algorithms, derive data insights, and test and validate algorithms and models using statistical methods and visualizations
- Hands-on experience with one or more LLMs (GPT, LLaMA, BLOOM, BERT, T5, PaLM, Meta, Google Gen AI Studio etc.)
- Hands-on experience in AI Cloud Tools (AWS SageMaker, TensorFlow, PyTorch, MS Azure, OpenAI, Hugging Face), MLOps / AIOps (Domino, MLflow), Low Code RPA (Appian, UiPath), Big Data, Python, Java, JS, Full Stack, NoSQL, API, Docker & Kubernetes
Basic Skill Set :
Data & Business Intelligence :
Data Architecture, Data Engineering, Data Governance, Data Quality, Data Lake and Data Warehouse, Data Science, DataOps, Data Discovery, Enterprise BI, Data Visualization
Data Science Toolkit :
IDEs, Jupyter, Data Analysis & Scientific Computation libraries (NumPy, SciPy, Pandas, SciKit, Matplotlib), Tableau, Plotly, Log Analytics
AI / ML & Analytics :
Artificial Intelligence & Machine Learning, Model Building & Deployment, Generative AI, Architectures (Transformers / Diffusion), LLMs, Vector Database, MLOps
Cloud :
Cloud Assessment, Enablement, Migration, Deployment (AWS, Azure & GCP), Cloud Data Warehouse
Databases :
Teradata, Microsoft SQL Server, Oracle, NoSQL databases (MongoDB), ELK, Snowflake, Postgres, MySQL
Data Management Tools & Web Services :
Informatica ETL, Eclipse IDE, REST / SOAP web services
AI Algorithms :
Linear / Logistic Regression, Classification, Clustering, NLP, LSTM, Time Series Analysis, Ensemble Techniques (Decision Trees etc.), Sentiment Analysis
People Leadership :
Stakeholder / Team Management