Cleaning downloaded filenames of invalid characters

A friend is working on a project where he downloads files from the internet using PowerShell. Files on a Unix system can have lots of characters in their names that you can't use on a Windows system. So which characters are those? Backslashes \, slashes / and many more. We could try to write a list of all the invalid characters by hand, but I think we would miss some. Let's get Windows to tell us instead.


[Screenshot: PowerShell .NET function call to list invalid filename characters]
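The screenshot itself is not reproduced here, so the following is only a sketch of what such a call might look like. It uses the same [System.IO.Path]::GetInvalidFileNameChars() method that appears later in this post; the variable names and the output formatting are my own assumptions.

```powershell
# Ask .NET which characters may not appear in a file name on this platform.
$invalid = [System.IO.Path]::GetInvalidFileNameChars()

# Many entries are unprintable control characters, so show the code point
# and print the character itself only when it is printable.
$invalid | ForEach-Object {
    $code  = [int]$_
    $shown = if ($code -lt 32) { '(control)' } else { $_ }
    'U+{0:X4}  {1}' -f $code, $shown
}
```

The list is platform dependent: Windows reports a long list, while PowerShell on Linux or macOS reports far fewer characters.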

So now we know which characters are invalid, but we need an easy way to clean filenames.

First let's just create a simple test string:

$badstring = '\asdv/ [space] dsa15:*dasdas/<>'

I have decided to use a simple regexp -replace. So let's join the array into a single string:

$illegalchars = [string]::join('',([System.IO.Path]::GetInvalidFileNameChars()))

Now we have a string containing all the invalid characters, so can we use it? Not quite yet. Since I decided to use a regexp -replace, I need to remember that the backslash is the regex escape character. So the backslash in the list has to be escaped with an extra backslash.

$illegalchars2 = [string]::join('',([System.IO.Path]::GetInvalidFileNameChars())) -replace '\\','\\'

Now we can try it out:

[Screenshot: invalid removal script]
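The original script is not reproduced here, so this is a reconstruction under assumptions rather than the exact script. It repeats the test string and the escaped character list from above, and assumes the list is wrapped in a regex character class ([...]) so that every listed character is matched individually; $cleanstring is a name I made up.

```powershell
# Test string and escaped character list, as defined earlier in the post.
$badstring     = '\asdv/ [space] dsa15:*dasdas/<>'
$illegalchars2 = [string]::join('',([System.IO.Path]::GetInvalidFileNameChars())) -replace '\\','\\'

# Assumption: wrap the list in [...] so the regex matches any single
# invalid character, then replace each match with nothing.
$cleanstring = $badstring -replace "[$illegalchars2]", ''
$cleanstring
```

On Windows, where \ / : * < > are all in the list, this should leave asdv [space] dsa15dasdas. The square brackets and spaces survive because they are valid in filenames.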

So there we have it: a simple way to remove invalid filename characters from a string, so that you can create files without the risk of creating invalid filenames.
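If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the function name Remove-InvalidFileNameChars and its parameters are hypothetical, and it swaps the manual backslash escaping for [regex]::Escape(), which escapes the backslash along with any other regex metacharacter in the list.

```powershell
# Hypothetical helper, not part of PowerShell itself.
function Remove-InvalidFileNameChars {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [string] $Replacement = ''   # e.g. '_' to keep a visible placeholder
    )

    # Join the invalid characters into one string and let .NET escape them.
    $chars   = [System.IO.Path]::GetInvalidFileNameChars()
    $escaped = [regex]::Escape(-join $chars)

    # Replace every character from the list with the replacement string.
    return [regex]::Replace($Name, "[$escaped]", $Replacement)
}

Remove-InvalidFileNameChars -Name 'report/2024.txt' -Replacement '_'
```

The slash is invalid on every platform, so the call above returns report_2024.txt wherever it runs. The rest of the list still depends on the operating system the script runs on.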

