'Regex expression for parsing SQL Server schema and tablename

I am trying to parse schema and table name using a regular expression. I am using C# (.NET Core) and System.Text.RegularExpressions.

Here are examples how the input can look like:

Schema.TableName
Schema . TableName
[Schema].[TableName]
[Schema] . [TableName]
Schema. [TableName]
[SchemaPart1.Part2.Part  3].[Table . Name]

And of course other variations that SQL Server accepts when you want to create a table using table and schema name.

Also, I am supporting other databases (MySql & Postgres), so I will reuse the same regex, but replace the [] with the right quotation character (` or " ). But I used the Sql Server examples because the start and ending quotation character are different.

My current regular expression looks like this:

\.?\[.+?\]|[^\[]+?(?=\.)|[^\[]+

This does work most of the times. But e.g. in this case [Schema] . [TableName] I get a match for whitespaces in between which I am not quite sure how to filter out.

Any help would be appreciated.



Solution 1:[1]

One possible way is to match schema, table and delimiters at once and use capturing groups to get the names separated from surrounding fluff. For example:

`/^(\[[^]]+]|\w+) *\. *(\[[^]]+]|\w+)/gm`

See regex demo.

C# example:

using System;
using System.Text.RegularExpressions;
// ...
String text = "Schema.TableName\nSchema . TableName\n[Schema].[TableName]\n[Schema] . [TableName]\nSchema. [TableName]\n[SchemaPart1.Part2.Part  3].[Table . Name]";
Regex re = new Regex(@"^(\[[^]]+\]|\w+) *\. *(\[[^]]+\]|\w+)", RegexOptions.Multiline);
Match match = re.Match(text);

while (match.Success) {
    Console.WriteLine("( '{0}', '{1}' )", match.Groups[1].Value, match.Groups[2].Value);
    match = match.NextMatch();
}

Output:

'Schema', 'TableName'
'Schema', 'TableName'
'[Schema]', '[TableName]'
'[Schema]', '[TableName]'
'Schema', '[TableName]'
'[SchemaPart1.Part2.Part  3]', '[Table . Name]'

OR

If you want to match schema or table without differentiating between them, you could try a variation of your expression. Be advised that this would also match simple strings like foo. For example:

`/\.? *(?:\[[^]]+\]|\w+)/g`

See regex demo.

C# example (using and text same as above):

MatchCollection matches = Regex.Matches(text, @"\.? *(?:\[[^]]+\]|\w+)");
foreach (Match m in matches) Console.WriteLine("'{0}'", m);

Output:

'Schema'
'.TableName'
'Schema'
'. TableName'
'[Schema]'
'.[TableName]'
'[Schema]'
'. [TableName]'
'Schema'
'. [TableName]'
'[SchemaPart1.Part2.Part  3]'
'.[Table . Name]'

Solution 2:[2]

or with labelling the parts , and removing the brackets

(?:(?:\[(?<server>[^\]]*)\]|(?<server>[^\.]*))\.)?(?:(?:\[(?<catalog>[^\]]*)\]|(?<catalog>[^\.]*))\.)?(?:(?:\[(?<schema>[^\]]*)\]|(?<schema>[^\.]*))\.)?(?:\[(?<object>[^\]]*)\]|(?<object>[^\.]*))$

!this needs righttoleft parsing

public static Regex regex = new Regex(
"(?:(?:\\[(?<server>[^\\]]*)\\]|(?<server>[^\\.]*))\\.)?(?:(?" +
":\\[(?<catalog>[^\\]]*)\\]|(?<catalog>[^\\.]*))\\.)?(?:(?:\\[" +
"(?<schema>[^\\]]*)\\]|(?<schema>[^\\.]*))\\.)?(?:\\[(?<objec" +
"t>[^\\]]*)\\]|(?<object>[^\\.]*))$",
RegexOptions.IgnoreCase | RegexOptions.Singleline 
| RegexOptions.CultureInvariant | RegexOptions.Compiled
| RegexOptions.RightToLeft );

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 oriberu
Solution 2 Luc Vandenbroucke