
Filter for AWS objects with these PowerShell commands for S3

Because of S3's flat structure, it can be difficult to find your way around the service. Learn how to use PowerShell S3 commands to find the right AWS objects.

There are many ways to administer workloads on AWS. The AWS Management Console is one approach, but it's often slower than a scripted option such as PowerShell. This is especially true if you have a Windows administration background.

Amazon S3 is a core AWS offering, so admins who work on the platform need to know their way around the object storage service. S3 stores data as objects in a flat structure, and the top-level container for those objects is called a bucket. There isn't a hierarchy like you see on a local drive. Instead, the S3 console infers one by using the forward slash as a delimiter in each object's name.

In this tutorial, you'll learn how to use PowerShell scripts to filter objects in S3.

Get started with PowerShell for AWS

Before you use PowerShell with Amazon S3, install the AWS Tools for PowerShell module and set up your credentials as described in the user guide.
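As a rough sketch of that setup, the commands below install the S3 module and store a credential profile. The sketch assumes the modular AWS.Tools.S3 package from the PowerShell Gallery; the access key, secret key and the profile name 'blog-demo' are placeholders, so follow the user guide for your own environment.

```powershell
# Assumption: the modular AWS.Tools.S3 package from the PowerShell Gallery.
# The older monolithic AWSPowerShell module exposes the same S3 cmdlets.
Install-Module -Name AWS.Tools.S3 -Scope CurrentUser

# Store a named credential profile. The key values and the profile name
# 'blog-demo' are placeholders, not real values.
Set-AWSCredential -AccessKey 'AKIAEXAMPLEKEY' -SecretKey 'example-secret' -StoreAs 'blog-demo'

# Make that profile and a region the defaults for the current session.
Set-AWSCredential -ProfileName 'blog-demo'
Set-DefaultAWSRegion -Region 'us-west-2'
```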

Before you start to look for objects, you need to select a bucket. In PowerShell, the Get-S3Bucket cmdlet will return a list of buckets based on your credentials. The list will look something like this:

PS>Get-S3Bucket

CreationDate BucketName
------------ ----------
4/25/2019 11:07:18 AM blogdemo
4/25/2019 11:08:07 AM psgitbackup

If you have a large number of buckets, you can filter them by region using the -Region parameter.

Get-S3Bucket -Region us-west-2

In my case, all of my buckets are in the us-west-2 region, so this example still displays every bucket:

PS>Get-S3Bucket -Region us-west-2

CreationDate BucketName
------------ ----------
4/25/2019 11:07:19 AM blogdemo
4/25/2019 11:08:07 AM psgitbackup

To find a list of all the AWS regions using PowerShell, simply run the Get-AWSRegion cmdlet.
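Because the bucket list is made up of ordinary PowerShell objects, you can probably narrow it further on the client side with Where-Object. The sketch below uses stand-in objects instead of a live Get-S3Bucket call so that it runs without AWS access. The BucketName and CreationDate properties match the columns shown above, and the 'webassets' bucket is made up for illustration.

```powershell
# Stand-in for the output of Get-S3Bucket so the sketch runs without AWS access.
# 'webassets' is a hypothetical bucket added to make the filters visible.
$buckets = @(
    [pscustomobject]@{ CreationDate = [datetime]'2019-04-25T11:07:18'; BucketName = 'blogdemo' }
    [pscustomobject]@{ CreationDate = [datetime]'2019-04-25T11:08:07'; BucketName = 'psgitbackup' }
    [pscustomobject]@{ CreationDate = [datetime]'2018-01-10T09:00:00'; BucketName = 'webassets' }
)

# Keep buckets whose name starts with 'ps'.
$byName = @($buckets | Where-Object { $_.BucketName -like 'ps*' })

# Keep buckets created on or after a cutoff date.
$cutoff = [datetime]'2019-01-01'
$recent = @($buckets | Where-Object { $_.CreationDate -ge $cutoff })

$byName.BucketName
$recent.BucketName
```

Against a live account, replacing $buckets with the output of Get-S3Bucket should give the same kind of result.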

Find objects directly

To retrieve information about objects in S3, use the Get-S3Object cmdlet. Be sure to specify the name of the bucket with the -BucketName parameter.

Get-S3Object -BucketName 'psgitbackup'

If you run this command with just a bucket name, it will return all the objects in that bucket. To narrow your results, there are a few helpful parameters you can use with the Get-S3Object cmdlet.

If you already know the name of the object you are looking for, you can specify it by key name, which identifies a specific object within a bucket. This value is case sensitive.

Get-S3Object -BucketName 'psgitbackup' -Key 'PSDropbox/.git/hooks/applypatch-msg.sample'

This will return the desired S3 object.

PS>Get-S3Object -BucketName 'psgitbackup' -Key 'PSDropbox/.git/hooks/applypatch-msg.sample'


ETag : "ce562e08d8098926a3862fc6e7905199"
BucketName : psgitbackup
Key : PSDropbox/.git/hooks/applypatch-msg.sample
LastModified : 4/25/2019 11:11:11 AM
Owner : Amazon.S3.Model.Owner
Size : 478
StorageClass : STANDARD

The good news is that you don't have to remember all of your objects' keys in S3; you can do some filtering using the same Get-S3Object cmdlet.

Filter by prefix

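With the same Get-S3Object cmdlet, you can specify a prefix so that only objects whose keys start with a certain value are returned. This is akin to listing the contents of a directory, with one difference: the prefix is a plain string match, so a prefix of 'PSDropbox' would also match any folder whose name merely begins with those characters.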

To only return objects that have the "PSDropbox" prefix -- which, in my case, is a folder -- add the trailing '/'.

Get-S3Object -BucketName 'psgitbackup' -KeyPrefix 'PSDropbox/'

This will return all objects in your S3 bucket that have the specified prefix. Keep in mind that -KeyPrefix is case-sensitive.

PS>Get-S3Object -BucketName 'psgitbackup' -KeyPrefix 'PSDropbox/'


ETag : "67e3be899c68d6ff7ac60673a6fe1681"
BucketName : psgitbackup
Key : PSDropbox/.git/COMMIT_EDITMSG
LastModified : 4/25/2019 11:10:58 AM
Owner : Amazon.S3.Model.Owner
Size : 51
StorageClass : STANDARD
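To see why the trailing '/' matters, here is a minimal sketch that imitates prefix matching locally. It assumes S3 treats the prefix as a plain, case-sensitive starts-with test on the key. The Select-KeyByPrefix helper and the 'PSDropboxOld' and lowercase 'psdropbox' keys are hypothetical and exist only for illustration.

```powershell
# Made-up key list; 'PSDropboxOld' is a hypothetical sibling folder.
$keys = @(
    'PSDropbox/.git/COMMIT_EDITMSG'
    'PSDropbox/src/PSDropbox.psd1'
    'PSDropboxOld/README.md'
    'psdropbox/notes.txt'
)

# Hypothetical helper that imitates prefix matching on a local list of keys.
function Select-KeyByPrefix {
    param(
        [string[]]$Key,
        [string]$Prefix
    )
    # Ordinal comparison: case-sensitive and free of culture rules.
    $Key | Where-Object { $_.StartsWith($Prefix, [System.StringComparison]::Ordinal) }
}

# Without the slash, the similarly named folder is matched as well.
$withoutSlash = @(Select-KeyByPrefix -Key $keys -Prefix 'PSDropbox')

# With the slash, only keys inside the PSDropbox "folder" remain.
$withSlash = @(Select-KeyByPrefix -Key $keys -Prefix 'PSDropbox/')

$withoutSlash
$withSlash
```

In both calls the lowercase 'psdropbox/notes.txt' key is left out, which reflects the case sensitivity mentioned above.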

Filter by delimiter

You can also use the -Delimiter parameter to group S3 objects. The S3 console uses '/' as its delimiter to present folders when you look at S3 through the web UI. It's a similar experience in PowerShell with the -Delimiter parameter.

For instance, if you wanted to list all of the objects in the root of the S3 bucket, but none that are in any subdirectories, simply use the following:

PS>Get-S3Object -BucketName psgitbackup -Delimiter '/'


ETag : "2c6b80682f93da00c3415286b4df174a"
BucketName : psgitbackup
Key : PSLog-2019-04-16.log
LastModified : 5/16/2019 11:50:38 AM
Owner : Amazon.S3.Model.Owner
Size : 1054836
StorageClass : STANDARD

ETag : "ca7495a011c5e9a33183bffec5ae415f"
BucketName : psgitbackup
Key : PSLog-2019-04-17-1.log
LastModified : 5/16/2019 11:50:36 AM
Owner : Amazon.S3.Model.Owner
Size : 114532
StorageClass : STANDARD

ETag : "d54d7bc536dd03a33da4ef598b4709c6"
BucketName : psgitbackup
Key : PSLog-2019-04-17.log
LastModified : 5/16/2019 11:50:38 AM
Owner : Amazon.S3.Model.Owner
Size : 1449299
StorageClass : STANDARD

You can see that there are only three objects in the root of my S3 bucket. I can even verify that by looking at the console.

delimiter in console

The S3 module in PowerShell doesn't return any folders that were also in the root. You can see them in the above screenshot, but they aren't returned by default with the Get-S3Object cmdlet. To retrieve those folders, run this command immediately after the previous one, because $AWSHistory holds the response from the most recent AWS cmdlet call.

$AWSHistory.LastCommand.Responses.Last.CommonPrefixes

This will return the "CommonPrefixes" -- in this case, the folders that weren't returned by default.

PS>$AWSHistory.LastCommand.Responses.Last.CommonPrefixes
PSAirTable/
PSDropbox/
PSSherpaDesk/
PS_PDFGeneratorAPI/
PoShPal/
PowerTrello/
PowereBay/

You can also use the -Delimiter parameter to filter out subfolders that you may not want to see. For instance, since these subfolders are all Git repositories, you might not want to see the ".git" folder in the output. If you specify a delimiter of ".git", every key that contains that string is rolled up into a common prefix and no longer appears in the object list.

Get-S3Object -BucketName 'psgitbackup' -Delimiter '.git'

As the object counts below demonstrate, filtering out subfolders can significantly reduce the output.

PS>(Get-S3Object -BucketName 'psgitbackup' -Delimiter '.git').count
180
PS>(Get-S3Object -BucketName 'psgitbackup').count
1702
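The rollup behavior can be imitated locally to make it easier to reason about. The sketch below is not the AWS implementation, only an approximation that assumes S3 cuts each key at the first occurrence of the delimiter after the prefix. The Group-KeyByDelimiter helper and several of the keys (such as '.gitignore' and 'README.md') are hypothetical.

```powershell
# Hypothetical helper: splits keys into plain objects and rolled-up common prefixes.
function Group-KeyByDelimiter {
    param(
        [string[]]$Key,
        [string]$Prefix = '',
        [string]$Delimiter = '/'
    )
    $objects  = [System.Collections.Generic.List[string]]::new()
    $prefixes = [System.Collections.Generic.List[string]]::new()

    foreach ($k in $Key) {
        # Skip keys that fall outside the requested prefix.
        if (-not $k.StartsWith($Prefix, [System.StringComparison]::Ordinal)) { continue }

        # Look for the delimiter in the part of the key after the prefix.
        $rest = $k.Substring($Prefix.Length)
        $i = $rest.IndexOf($Delimiter, [System.StringComparison]::Ordinal)

        if ($i -ge 0) {
            # Roll the key up to the end of the first delimiter occurrence.
            $common = $Prefix + $rest.Substring(0, $i + $Delimiter.Length)
            if (-not $prefixes.Contains($common)) { $prefixes.Add($common) }
        } else {
            $objects.Add($k)
        }
    }
    [pscustomobject]@{ Objects = $objects.ToArray(); CommonPrefixes = $prefixes.ToArray() }
}

# Partly made-up key list modeled on the bucket in this article.
$keys = @(
    'PSLog-2019-04-16.log'
    'PSDropbox/.git/COMMIT_EDITMSG'
    'PSDropbox/.git/hooks/applypatch-msg.sample'
    'PSDropbox/.gitignore'
    'PSDropbox/src/PSDropbox.psd1'
    'PowerTrello/.git/HEAD'
    'PowerTrello/README.md'
)

$root  = Group-KeyByDelimiter -Key $keys -Delimiter '/'
$noGit = Group-KeyByDelimiter -Key $keys -Delimiter '.git'

$root.Objects
$root.CommonPrefixes
$noGit.Objects
$noGit.CommonPrefixes
```

One consequence of this model is that a '.git' delimiter also hides keys such as the hypothetical 'PSDropbox/.gitignore', because the match is on the string rather than on a folder boundary.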

Combine prefixes and delimiters

If you combine both prefixes and delimiters, you can easily traverse an S3 bucket similar to how you traverse a file system.

For instance, to find all the objects in a subdirectory, run the following:

Get-S3Object -BucketName psgitbackup -KeyPrefix 'PSDropbox/src/' -Delimiter '/'

Note the trailing '/' on the -KeyPrefix parameter. Without it, the command won't return any objects.

PS>Get-S3Object -BucketName psgitbackup -KeyPrefix 'PSDropbox/src/' -Delimiter '/'


ETag : "313ac20f13ee65e47e3d395b67d965e6"
BucketName : psgitbackup
Key : PSDropbox/src/PSDropbox.psd1
LastModified : 4/25/2019 11:10:59 AM
Owner : Amazon.S3.Model.Owner
Size : 7756
StorageClass : STANDARD

Then you can check CommonPrefixes again to see what folders or directories are also present.

PS>$AWSHistory.LastCommand.Responses.Last.CommonPrefixes
PSDropbox/src/Private/
PSDropbox/src/Public/
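Prefixes and delimiters only match from the start of a key, so criteria such as a file extension or a minimum size would likely need a client-side filter. The sketch below uses stand-in objects with the Key and Size properties shown in the output above instead of a live Get-S3Object call, so treat it as an illustration rather than a tested recipe.

```powershell
# Stand-in for Get-S3Object output; keys and sizes are copied from the examples above.
$objects = @(
    [pscustomobject]@{ Key = 'PSLog-2019-04-16.log';         Size = 1054836 }
    [pscustomobject]@{ Key = 'PSLog-2019-04-17-1.log';       Size = 114532 }
    [pscustomobject]@{ Key = 'PSLog-2019-04-17.log';         Size = 1449299 }
    [pscustomobject]@{ Key = 'PSDropbox/src/PSDropbox.psd1'; Size = 7756 }
)

# Keep only .log objects larger than 1 MB (1MB is PowerShell shorthand for 1048576 bytes).
$largeLogs = @($objects | Where-Object { $_.Key -like '*.log' -and $_.Size -gt 1MB })

$largeLogs.Key
```

With a live bucket, piping the output of Get-S3Object into the same Where-Object filter should behave the same way, although every object is still listed from S3 before the filter is applied.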
