We have an immediate opportunity with one of our direct clients. Please find the job description below. If you are interested, please forward your resume and share the following details:
Work Authorization:
Expected Hourly Pay Rate (W2):
Month & Day of Birth (MM/DD):
Present Location & Zip Code:
LinkedIn:
Job Title: Cluster Engineer
Location: Hillsboro, OR
Duration: 6-month contract
100% remote
Laptop provided
High School Diploma / GED Is Required
The client is looking for an experienced cluster administrator to manage its Client Labs academic research cluster with 200+ compute nodes. The right candidate will have experience with popular HPC technologies, be able to work independently for extended periods, and be eager to learn quickly by solving complex problems.
Responsibilities
Serve as the cluster administrator for the Intel Labs academic research cluster.
Drive deployments of new servers of varying types, depending on need. This will likely involve engaging with product teams to apply best-known methods and configurations.
Serve as the technical escalation point for first-level support technicians, likely on topics such as job submissions, resource requests, and troubleshooting.
Monitor and report utilization of cluster resources, including compute nodes and NAS storage, using existing tools such as Zabbix and potentially Grafana + Prometheus.
Serve as the owner of the SLURM job scheduler, defining the configuration that best fits the user base and developing/enabling advanced features as applicable.
Maintain the software stack, including Intel oneAPI, OpenHPC, and one-off package install requests, across compute nodes using the existing Chef configuration management tool.
Educate users with varying levels of experience on how to best use the cluster when they reach out with questions. Regularly update the user documentation with the most relevant details to improve their experience.
Desired Skills
Good communication skills. You can effectively communicate with a variety of stakeholders, including presenting plans to higher management and having technical discussions with engineers/scientists.
Experience designing and managing large clusters with heterogeneous HW (CPUs, GPUs, etc.)
User-centric and results-oriented. You can learn from data what the needs of our scientists/engineers will be and can produce a cluster growth plan to fulfill these needs.
Power user. You are willing to extensively test the different workflows that run on the cluster and help optimize them.
Cluster tech stack. You are an expert on cluster orchestration and management, familiar with technologies such as SLURM, Docker, Zabbix, Chef, etc. (or you are willing to learn them quickly)
Seniority level: Entry level
Employment type: Contract
Job function: Engineering and Information Technology
Industries: IT Services and IT Consulting