Cloud Engineer - AI ML

Kuala Lumpur
Permanent
Full-time

7 hours ago

You will serve as a subject‑matter expert (SME) providing Level‑3 technical support across Google Cloud’s AI/ML portfolio, with emphasis on Vertex AI, GenAI, Conversational AI, and Other AI services. The role centers on rapid, high‑quality incident response, root‑cause diagnosis, and resolution for complex customer cases, while maintaining SLOs, CSAT targets, and rigorous documentation standards across phone, email, and chat channels.

Key Responsibilities

Own complex incidents end‑to‑end: triage, reproduce, diagnose, and resolve issues for AI/ML products; maintain transparent customer communication and accurate case records.
Respond to, diagnose, resolve, and track customer support queries by phone, email, and chat.
Maintain response and resolution speed as defined by SLOs.
Keep high customer satisfaction scores and follow quality standards in 90% of cases.
Assist and respond to consults from other technical support representatives through existing systems and tools.
Use existing troubleshooting tools and techniques to establish the root cause of queries and provide a customer-facing root cause assessment.
Understand the business impact of customer issue reports, follow internal issue prioritization guidelines, and justify the priority assigned to any single customer report.
Classify queries internally, documenting classes of problems and preventative actions for later retrospective analysis.
Reactively (e.g., as a result of a query), file issue reports to Google engineers and collaborate with them to diagnose customer issues; build documentation and procedures; document desired behavior and/or steps to reproduce; suggest code-level resolutions for complex product bugs; and assist engineers in driving bugs to resolution.
Perform community management tasks as needed by the business.
Promptly and independently resolve technical incidents and escalations, with effective communication to all stakeholders internally and externally, so that no monitoring is needed by Google engineers.
Take cases involving customer-specific architectural design requirements and provide solutions limited to a particular product (or a subset of product features).
Community contributions: solutions posts, FAQs, and guidance on best practices for AI/ML deployments and responsible AI usage.

Product Scope & Typical Case Patterns

Vertex AI

Introduction/AutoML: dataset ingestion, labeling, AutoML training failures, metric drift, imbalance handling.
Notebooks: environment provisioning, dependency/runtime conflicts, GPU/TPU access, kernel issues.
AI Vector Search: index build latency, recall/precision tuning, ANN configuration, embedding mismatches.
Pipelines: DAG orchestration failures, component contract issues, artifact lineage, caching.
Prediction (Online/Batch): endpoint scaling, model versioning, cold‑start latency, batch job retries.
Training: hyperparameter tuning, distributed training, accelerator utilization, checkpointing.
Model Registry: version promotion policies, metadata integrity, rollback flows.
Managed Datasets: schema evolution, governance, access control.
Explainable AI: feature attributions, baselines, compliance requests.
Feature Store: ingestion latency, online/offline store consistency, backfills.

GenAI

LLMs & GenAI Introduction: prompt engineering pitfalls, safety filters, quota/latency.
Vertex AI Gemini: model selection, context window sizing, tool‑use function calling, grounding.
Vertex AI Search & Conversation: data connectors, retrieval quality, schema/FAQ ingestion.
Discovery AI Retail Search: relevance tuning, synonym/attribute mapping, cold‑start catalogue issues.
Vertex Gen AI Studio: prototype to production handoff, evaluation harnesses.
Vertex Model Garden: model availability, versioning, licenses, tuning envelopes.

Conversational AI

Dialogflow ES/CX: intent/flow design, session state, webhook reliability, NLU regression.
CCAI Platform / CCaaS: telephony integration, routing, agent desktop, compliance.
CCAI Insights: transcript accuracy, sentiment, redaction, analytics pipelines.
Contact Center AI (General): deployment patterns, multichannel orchestration.
Speech‑to‑Text / Text‑to‑Speech: language/acoustic models, latency, accuracy, voice settings.
Agent Assist: suggestion quality, knowledge base integration, real‑time performance.

Other AI

Healthcare Data Engine (HDE): FHIR mapping, interoperability, privacy controls.
Document AI: processor selection, field extraction accuracy, batch throughput.
Vision API: model outputs, rate limits, edge cases, dataset curation.

Minimum Qualifications

Technical Support Experience (L2/L3) for cloud AI/ML platforms, with proven incident ownership, RCA delivery, and cross‑functional collaboration.
Troubleshooting & Analysis: proficiency with logs, metrics, tracing; ability to interpret model artifacts, pipeline steps, and service quotas.
Communication: customer‑friendly RCA and escalation narratives; ability to handle sensitive, high‑impact scenarios.
Language: Mandarin B2 (CEFR) mandatory; English professional working proficiency.
Experience: 2-6 years on Google Cloud or another cloud platform such as AWS or Azure.

Preferred Skills & Product Certifications

Vertex AI Track

AutoML, Notebooks, Pipelines, Vector Search, Training/Prediction (online/batch), Model Registry, Managed Datasets, Explainable AI, Feature Store.

GenAI Track

Gemini family on Vertex AI; Search & Conversation; Discovery AI Retail Search; Gen AI Studio; Model Garden (model selection, safety, evaluation).

Conversational Track

Dialogflow ES/CX design and troubleshooting; CCAI Platform/CCaaS integrations; CCAI Insights; STT/TTS; Agent Assist.

Other AI Track

HDE (FHIR/health data), Document AI processors, Vision API.

Certifications (nice‑to‑have)

Google Cloud Professional ML Engineer, Professional Cloud Architect/Developer, Data Engineer; Dialogflow/CCAI badges; Responsible AI training.
Relevant third‑party: conversational design, speech technologies, healthcare data standards.

About Accenture

Accenture is a leading global professional services company that helps the world’s leading businesses, governments and other organizations build their digital core, optimize their operations, accelerate revenue growth and enhance citizen services, creating tangible value at speed and scale. We are a talent- and innovation-led company with approximately 791,000 people serving clients in more than 120 countries. Technology is at the core of change today, and we are one of the world’s leaders in helping drive that change, with strong ecosystem relationships. We combine our strength in technology and leadership in cloud, data and AI with unmatched industry experience, functional expertise and global delivery capability. Our broad range of services, solutions and assets across Strategy & Consulting, Technology, Operations, Industry X and Song, together with our culture of shared success and commitment to creating 360° value, enable us to help our clients reinvent and build trusted, lasting relationships. We measure our success by the 360° value we create for our clients, each other, our shareholders, partners and communities.

Visit us at

Equal Employment Opportunity Statement

We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, military veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by applicable law. Our rich diversity makes us more innovative, more competitive, and more creative, which helps us better serve our clients and our communities.

Accenture
