Amazon SSM agent is using all of my i-nodes #94

andrewme2 · 2018-04-03T12:24:39Z

The SSM agent (v2.2.93 and v2.2.120) does not seem to be cleaning up files in the orchestration directory. The issue is reproducible on many EC2 instances that I have.

I have several EC2 instances where i-node usage is 90%+ and one where it is 100%. Once 100% is reached, the system effectively locks up and the ssm log files show many "no space left on device" errors.

I tracked the issue down to the SSM agent not cleaning up files in this directory:

 /var/lib/amazon/ssm/<instance-ID>/document/orchestration

For example, one of my EC2 instances has files in the orchestration directory going back to Dec. 5.
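To check whether an instance is affected, a minimal sketch along these lines may help. It assumes a Linux host with GNU `df` and `find`; the function names `inode_use_percent` and `count_entries` are hypothetical helpers, not part of the SSM agent.

```shell
#!/bin/sh
# Print the inode usage percentage (the IUse% column) for the filesystem
# that holds the given path. Assumes GNU df; -P keeps each record on one line.
inode_use_percent() {
    df -Pi "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Count every file and directory below the given directory.
# Each entry consumes one inode, so a large number here points at the culprit.
count_entries() {
    find "$1" -mindepth 1 | wc -l | tr -d ' '
}
```

For example, `inode_use_percent /var/lib/amazon/ssm` reports how full the inode table is, and `count_entries` pointed at the orchestration directory shows how many inodes it holds.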

A similar issue was fixed in 2.2.24.0:

 https://github.com/aws/amazon-ssm-agent/issues/72

wael-amz · 2018-04-03T18:16:14Z

We are investigating the issue.

vjt · 2018-04-18T16:28:48Z

Hi @andrewme2,

I am also experiencing this issue. As a workaround, I am using this cron job in root's crontab:

0 0 * * * find /var/lib/amazon/ssm/i-*/document/orchestration -mindepth 2 -maxdepth 2 -type d -mtime +3 -print0 | xargs -0 rm -rf

This runs every day at midnight and deletes orchestration directories that were last modified more than 3 days ago.
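Before enabling a job like the one above, it may be worth previewing what it would match. The sketch below reuses the same `find` criteria but only prints the matches; the function name `list_stale_dirs` and its arguments are hypothetical.

```shell
#!/bin/sh
# List the directories the cron job would remove, without deleting anything.
# $1: an orchestration directory (the starting point for find)
# $2: age threshold in days (defaults to 3, as in the cron job)
list_stale_dirs() {
    find "$1" -mindepth 2 -maxdepth 2 -type d -mtime +"${2:-3}" -print
}
```

Two details of the criteria are worth knowing. `-mindepth 2 -maxdepth 2` matches only directories exactly two levels below the starting point, so the first-level directories themselves are left in place. GNU `find` also discards fractional days when evaluating `-mtime`, so `+3` matches entries last modified at least 4 full days ago.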

andrewme2 · 2018-04-18T18:20:40Z

Thanks @vjt ! That's a good workaround until this bug is fixed.

kapilt · 2018-05-24T14:12:12Z

Has this been resolved in a release?

ldm810 · 2018-06-11T15:15:45Z

@wael-amz Is there a workaround or updates on this issue?

wael-amz · 2018-06-27T01:46:02Z

The issue has been addressed in the latest release, 2.2.800.0.

andrewme2 · 2018-06-27T14:42:41Z

Hi @wael-amz, I'm trying to verify that this bug is fixed. I upgraded to 2.2.800.0. How long do I need to wait before the SSM agent cleans up old files? I do not know how often the agent runs its clean-up routine (every hour, once per day, etc.). After running for ~2 hours, I still have directories in the orchestration directory from May:

root@ip-X-X-X-X:~# ls -ltr /var/lib/amazon/ssm/<instance-ID>/document/orchestration | head
total 774160
drwx------ 3 root root 4096 May 1 17:59 9b1a8ca8-REDACTED
drw------- 3 root root 4096 May 1 17:59 8c7ca5d1-REDACTED
drw------- 3 root root 4096 May 1 18:02 a7a2756a-REDACTED
drwx------ 3 root root 4096 May 1 18:03 6dd8855e-REDACTED
drwx------ 3 root root 4096 May 1 18:07 28445cc3-REDACTED
drwx------ 3 root root 4096 May 1 18:07 dca845eb-REDACTED
drw------- 3 root root 4096 May 1 18:10 e2bd2589-REDACTED
drwx------ 3 root root 4096 May 1 18:10 a4e4b5e3-REDACTED
drw------- 3 root root 4096 May 1 18:11 cd02fb83-REDACTED

wael-amz · 2018-06-28T17:43:50Z

The cleanup logic is triggered every time you run a command; it is not time-based.
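Since the cleanup is tied to command execution, one way to verify the fix is to note the age of the oldest orchestration entry, run any command through SSM, and check again. The sketch below assumes GNU `find` (for `-printf`); the function name `oldest_entry_age_days` is a hypothetical helper.

```shell
#!/bin/sh
# Print the age, in whole days, of the oldest entry directly inside a directory.
# Returns 1 and prints nothing if the directory is empty.
oldest_entry_age_days() {
    # %T@ is the modification time in seconds since the epoch (with a fraction).
    oldest=$(find "$1" -mindepth 1 -maxdepth 1 -printf '%T@\n' \
        | sort -n | head -n 1 | cut -d. -f1)
    [ -n "$oldest" ] || return 1
    echo $(( ( $(date +%s) - $oldest ) / 86400 ))
}
```

If the number drops after a command has run on the instance, the agent's cleanup has removed the older entries.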

wael-amz closed this as completed Jun 27, 2018

88lexd mentioned this issue Jul 24, 2018

SSM agent not working with Amazon Linux 1 #107

Closed

hexsel mentioned this issue Sep 29, 2022

Amazon SSM agent is still not cleaning up the "orchestration" folder properly #471

Open
