NVIDIA NCP-AIO - NCP - AI Operations Exam

Question #6 (Topic: Exam A)
You are managing a Slurm cluster with multiple GPU nodes, each equipped with different types of GPUs. Some jobs are being allocated GPUs that should be reserved for other purposes, such as display rendering.
How would you ensure that only the intended GPUs are allocated to jobs?
A. Verify that the GPUs are correctly listed in both gres.conf and slurm.conf, and ensure that unconfigured GPUs are excluded.
B. Use nvidia-smi to manually assign GPUs to each job before submission.
C. Reinstall the NVIDIA drivers to ensure proper GPU detection by Slurm.
D. Increase the number of GPUs requested in the job script to avoid using unconfigured GPUs.
Answer: A
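As an illustration of option A, the sketch below compares the GPU count that gres.conf exposes for a node with the count declared on that node's Gres= entry in slurm.conf. It is a minimal sketch, not a Slurm tool: the node name gpu01, the device paths, and the helper names are hypothetical, and it assumes one definition per line with keys spelled exactly as shown. In this made-up layout a fifth GPU used for display is simply left out of both files.

```python
import re

# Hypothetical config fragments for an imaginary node "gpu01".
GRES_CONF = """
# Only the four compute GPUs are listed; the display GPU is omitted.
NodeName=gpu01 Name=gpu Type=a100 File=/dev/nvidia[0-3]
"""

SLURM_CONF = """
GresTypes=gpu
NodeName=gpu01 CPUs=64 RealMemory=512000 Gres=gpu:a100:4 State=UNKNOWN
"""

def _fields(line):
    """Split one config line into a {key: value} dict, ignoring comments."""
    line = line.split("#", 1)[0]
    return dict(tok.split("=", 1) for tok in line.split() if "=" in tok)

def count_devices(file_spec):
    """Count device files in a File= value such as /dev/nvidia[0-3]."""
    match = re.search(r"\[(.+)\]", file_spec)
    if not match:
        return 1  # a single device path with no bracket range
    total = 0
    for part in match.group(1).split(","):
        if "-" in part:
            lo, hi = part.split("-")
            total += int(hi) - int(lo) + 1
        else:
            total += 1
    return total

def gres_gpu_count(gres_conf, node):
    """Number of GPU device files gres.conf exposes for the given node."""
    total = 0
    for line in gres_conf.splitlines():
        f = _fields(line)
        if f.get("NodeName") == node and f.get("Name") == "gpu" and "File" in f:
            total += count_devices(f["File"])
    return total

def slurm_gpu_count(slurm_conf, node):
    """Number of GPUs declared in the node's Gres= entry in slurm.conf."""
    total = 0
    for line in slurm_conf.splitlines():
        f = _fields(line)
        if f.get("NodeName") == node and "Gres" in f:
            for gres in f["Gres"].split(","):
                parts = gres.split(":")
                if parts[0] == "gpu":
                    # A bare "gpu" with no explicit count is treated as 1.
                    total += int(parts[-1]) if parts[-1].isdigit() else 1
    return total

if __name__ == "__main__":
    g = gres_gpu_count(GRES_CONF, "gpu01")
    s = slurm_gpu_count(SLURM_CONF, "gpu01")
    print("gres.conf:", g, "slurm.conf:", s, "consistent:", g == s)
```

A mismatch between the two counts, or a device file in gres.conf that belongs to a GPU meant for another purpose, is the kind of misconfiguration option A asks you to rule out.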
Question #7 (Topic: Exam A)
A data scientist training a deep learning model notices slower-than-expected training times and asks a system administrator to investigate. The administrator suspects that disk I/O is the bottleneck.
Which command should the administrator use?
A. tcpdump
B. iostat
C. nvidia-smi
D. htop
Answer: B
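To show how iostat output might be used here, the sketch below scans an extended device report (as produced by something like iostat -dx) and lists devices whose %util is at or above a threshold. The sample text is a trimmed, made-up report; the function name busy_devices and the 80% threshold are assumptions for illustration. Real reports contain more columns, so the code finds the %util column from the header line instead of assuming a fixed position.

```python
# Trimmed, hypothetical sample of an extended device report.
SAMPLE_IOSTAT = """\
Device            r/s     w/s     rkB/s     wkB/s  %util
nvme0n1        950.00   12.00 120000.00    400.00  97.50
sda              1.00    3.00     16.00     48.00   0.40
"""

def busy_devices(iostat_output, threshold=80.0):
    """Return (device, %util) pairs whose utilization meets the threshold."""
    busy = []
    util_idx = None  # column index of %util, learned from the header line
    for line in iostat_output.splitlines():
        cols = line.split()
        if not cols:
            util_idx = None  # a blank line ends the current device table
            continue
        if cols[0].startswith("Device"):
            util_idx = cols.index("%util") if "%util" in cols else None
            continue
        if util_idx is not None and len(cols) > util_idx:
            util = float(cols[util_idx])
            if util >= threshold:
                busy.append((cols[0], util))
    return busy

if __name__ == "__main__":
    for device, util in busy_devices(SAMPLE_IOSTAT):
        print(f"{device} is {util:.1f}% utilized")
```

A device that stays near 100% utilization while the GPUs sit idle is consistent with the administrator's suspicion that storage, not compute, is limiting training speed.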
Question #8 (Topic: Exam A)
You have noticed that users can access all GPUs on a node even when they request only one GPU in their job script using --gres=gpu:1. This is causing resource contention and inefficient GPU usage.
What configuration change would you make to restrict users’ access to only their allocated GPUs?
A. Increase the memory allocation per job to limit access to other resources on the node.
B. Enable cgroup enforcement in cgroup.conf by setting ConstrainDevices=yes.
C. Set a higher priority for jobs requesting fewer GPUs, so they finish faster and free up resources sooner.
D. Modify the job script to include additional resource requests for CPU cores alongside GPUs.
Answer: B
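The setting named in option B can be checked mechanically. The sketch below reads the text of a cgroup.conf and reports whether ConstrainDevices is set to yes. The sample file contents and the function name are hypothetical, and the check is deliberately narrow: device constraint also depends on the cgroup task plugin being enabled in slurm.conf (TaskPlugin=task/cgroup), which this sketch does not inspect.

```python
# Hypothetical cgroup.conf contents used only for illustration.
SAMPLE_CGROUP_CONF = """
# Confine each job to the resources it was allocated.
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainDevices=yes
"""

def constrain_devices_enabled(cgroup_conf):
    """Return True if the given cgroup.conf text sets ConstrainDevices=yes."""
    for line in cgroup_conf.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "constraindevices":
            return value.lower() == "yes"
    return False  # treated as disabled here when the key is absent

if __name__ == "__main__":
    print("ConstrainDevices enabled:", constrain_devices_enabled(SAMPLE_CGROUP_CONF))
```

With device constraint active, a job submitted with --gres=gpu:1 should see only its allocated GPU device file, which addresses the contention described in the question.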
Question #9 (Topic: Exam A)
A new researcher needs access to GPU resources but should not have permission to modify cluster settings or manage other users.
What role should you assign them in Run:ai?
A. L1 Researcher
B. Department Administrator
C. Application Administrator
D. Research Manager
Answer: A
Question #10 (Topic: Exam A)
When troubleshooting Slurm job scheduling issues, a common source of problems is jobs getting stuck in a pending state indefinitely.
Which Slurm command can be used to view detailed information about all pending jobs and identify the cause of the delay?
A. scontrol
B. sacct
C. sinfo
Answer: A
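The output of scontrol show job includes a JobState and a Reason field for each job, which is what makes it useful for diagnosing pending jobs. The sketch below extracts those two fields from such output and returns the reason for every pending job. The sample text is a shortened, made-up excerpt (real records contain many more fields), and the function name pending_reasons is an assumption for illustration.

```python
import re

# Shortened, hypothetical excerpt in the style of `scontrol show job` output.
SAMPLE_SCONTROL = """\
JobId=101 JobName=train
   UserId=alice(1001) GroupId=ml(1001)
   JobState=PENDING Reason=Resources Dependency=(null)

JobId=102 JobName=eval
   JobState=RUNNING Reason=None Dependency=(null)

JobId=103 JobName=sweep
   JobState=PENDING Reason=Priority Dependency=(null)
"""

def pending_reasons(scontrol_output):
    """Map each pending job ID to the Reason reported for it."""
    reasons = {}
    # Job records are separated by blank lines.
    for record in re.split(r"\n\s*\n", scontrol_output.strip()):
        fields = dict(re.findall(r"(\w+)=(\S+)", record))
        if fields.get("JobState") == "PENDING" and "JobId" in fields:
            reasons[fields["JobId"]] = fields.get("Reason", "unknown")
    return reasons

if __name__ == "__main__":
    for job_id, reason in pending_reasons(SAMPLE_SCONTROL).items():
        print(f"job {job_id} is pending: {reason}")
```

Reasons such as Resources or Priority indicate that the job is waiting its turn, while other values can point to limits, dependencies, or node problems that need an administrator's attention.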
Total 66 questions