'Regex expression for parsing SQL Server schema and tablename
I am trying to parse schema and table name using a regular expression. I am using C# (.NET Core) and System.Text.RegularExpressions
.
Here are examples how the input can look like:
Schema.TableName
Schema . TableName
[Schema].[TableName]
[Schema] . [TableName]
Schema. [TableName]
[SchemaPart1.Part2.Part 3].[Table . Name]
And of course other variations that SQL Server accepts when you want to create a table using table and schema name.
Also, I am supporting other databases (MySql & Postgres), so I will reuse the same regex, but replace the [] with the right quotation character (` or " ). But I used the Sql Server examples because the start and ending quotation character are different.
My current regular expression looks like this:
\.?\[.+?\]|[^\[]+?(?=\.)|[^\[]+
This does work most of the times. But e.g. in this case [Schema] . [TableName]
I get a match for whitespaces in between which I am not quite sure how to filter out.
Any help would be appreciated.
Solution 1:[1]
One possible way is to match schema, table and delimiters at once and use capturing groups to get the names separated from surrounding fluff. For example:
`/^(\[[^]]+]|\w+) *\. *(\[[^]]+]|\w+)/gm`
See regex demo.
C# example:
using System;
using System.Text.RegularExpressions;
// ...
String text = "Schema.TableName\nSchema . TableName\n[Schema].[TableName]\n[Schema] . [TableName]\nSchema. [TableName]\n[SchemaPart1.Part2.Part 3].[Table . Name]";
Regex re = new Regex(@"^(\[[^]]+\]|\w+) *\. *(\[[^]]+\]|\w+)", RegexOptions.Multiline);
Match match = re.Match(text);
while (match.Success) {
Console.WriteLine("( '{0}', '{1}' )", match.Groups[1].Value, match.Groups[2].Value);
match = match.NextMatch();
}
Output:
'Schema', 'TableName'
'Schema', 'TableName'
'[Schema]', '[TableName]'
'[Schema]', '[TableName]'
'Schema', '[TableName]'
'[SchemaPart1.Part2.Part 3]', '[Table . Name]'
OR
If you want to match schema or table without differentiating between them, you could try a variation of your expression. Be advised that this would also match simple strings like foo
. For example:
`/\.? *(?:\[[^]]+\]|\w+)/g`
See regex demo.
C# example (using
and text
same as above):
MatchCollection matches = Regex.Matches(text, @"\.? *(?:\[[^]]+\]|\w+)");
foreach (Match m in matches) Console.WriteLine("'{0}'", m);
Output:
'Schema'
'.TableName'
'Schema'
'. TableName'
'[Schema]'
'.[TableName]'
'[Schema]'
'. [TableName]'
'Schema'
'. [TableName]'
'[SchemaPart1.Part2.Part 3]'
'.[Table . Name]'
Solution 2:[2]
or with labelling the parts , and removing the brackets
(?:(?:\[(?<server>[^\]]*)\]|(?<server>[^\.]*))\.)?(?:(?:\[(?<catalog>[^\]]*)\]|(?<catalog>[^\.]*))\.)?(?:(?:\[(?<schema>[^\]]*)\]|(?<schema>[^\.]*))\.)?(?:\[(?<object>[^\]]*)\]|(?<object>[^\.]*))$
!this needs righttoleft parsing
public static Regex regex = new Regex(
"(?:(?:\\[(?<server>[^\\]]*)\\]|(?<server>[^\\.]*))\\.)?(?:(?" +
":\\[(?<catalog>[^\\]]*)\\]|(?<catalog>[^\\.]*))\\.)?(?:(?:\\[" +
"(?<schema>[^\\]]*)\\]|(?<schema>[^\\.]*))\\.)?(?:\\[(?<objec" +
"t>[^\\]]*)\\]|(?<object>[^\\.]*))$",
RegexOptions.IgnoreCase | RegexOptions.Singleline
| RegexOptions.CultureInvariant | RegexOptions.Compiled
| RegexOptions.RightToLeft );
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | oriberu |
Solution 2 | Luc Vandenbroucke |