How to convert camel case to snake case with two capitals next to each other
I am trying to convert camel case to snake case.
Like this:
"LiveKarma"
-> "live_karma"
"youGO"
-> "you_g_o"
I cannot get the second example to work like that. My code always outputs 'you_go'. How can I get it to output 'you_g_o'?
My code:
(Regex.Replace(line, "(?<=[a-z0-9])[A-Z]", "_$0", RegexOptions.Compiled)).ToLowerInvariant()
Solution 1:[1]
Here is an extension method that converts text to snake case:
using System;
using System.Text;
public static string ToSnakeCase(this string text)
{
    if(text == null) {
        throw new ArgumentNullException(nameof(text));
    }
    if(text.Length < 2) {
        // Lower-case empty or single-character input too, so "A" becomes "a".
        return text.ToLowerInvariant();
    }
    var sb = new StringBuilder();
    sb.Append(char.ToLowerInvariant(text[0]));
    for(int i = 1; i < text.Length; ++i) {
        char c = text[i];
        if(char.IsUpper(c)) {
            // Every capital starts a new word, so "youGO" becomes "you_g_o".
            sb.Append('_');
            sb.Append(char.ToLowerInvariant(c));
        } else {
            sb.Append(c);
        }
    }
    return sb.ToString();
}
Put it into a static class somewhere (named, for example, StringExtensions) and use it like this:
string text = "LiveKarma";
string snakeCaseText = text.ToSnakeCase();
// snakeCaseText => "live_karma"
Solution 2:[2]
Since treating every letter of an abbreviation as a separate word is not suitable for many use cases, I found a more complete solution in the EFCore.NamingConventions codebase.
Here are a couple of examples of how the code works:
TestSC -> test_sc
testSC -> test_sc
TestSnakeCase -> test_snake_case
testSnakeCase -> test_snake_case
TestSnakeCase123 -> test_snake_case123
_testSnakeCase123 -> _test_snake_case123
test_SC -> test_sc
I rewrote it a bit so you can copy it as a ready-to-use string extension:
using System;
using System.Globalization;
using System.Text;
namespace Extensions
{
    public static class StringExtensions
    {
        public static string ToSnakeCase(this string text)
        {
            if (string.IsNullOrEmpty(text))
            {
                return text;
            }
            var builder = new StringBuilder(text.Length + Math.Min(2, text.Length / 5));
            var previousCategory = default(UnicodeCategory?);
            for (var currentIndex = 0; currentIndex < text.Length; currentIndex++)
            {
                var currentChar = text[currentIndex];
                if (currentChar == '_')
                {
                    builder.Append('_');
                    previousCategory = null;
                    continue;
                }
                var currentCategory = char.GetUnicodeCategory(currentChar);
                switch (currentCategory)
                {
                    case UnicodeCategory.UppercaseLetter:
                    case UnicodeCategory.TitlecaseLetter:
                        if (previousCategory == UnicodeCategory.SpaceSeparator ||
                            previousCategory == UnicodeCategory.LowercaseLetter ||
                            previousCategory != UnicodeCategory.DecimalDigitNumber &&
                            previousCategory != null &&
                            currentIndex > 0 &&
                            currentIndex + 1 < text.Length &&
                            char.IsLower(text[currentIndex + 1]))
                        {
                            builder.Append('_');
                        }
                        currentChar = char.ToLower(currentChar, CultureInfo.InvariantCulture);
                        break;
                    case UnicodeCategory.LowercaseLetter:
                    case UnicodeCategory.DecimalDigitNumber:
                        if (previousCategory == UnicodeCategory.SpaceSeparator)
                        {
                            builder.Append('_');
                        }
                        break;
                    default:
                        if (previousCategory != null)
                        {
                            previousCategory = UnicodeCategory.SpaceSeparator;
                        }
                        continue;
                }
                builder.Append(currentChar);
                previousCategory = currentCategory;
            }
            return builder.ToString();
        }
    }
}
You can find the original code here: https://github.com/efcore/EFCore.NamingConventions/blob/main/EFCore.NamingConventions/Internal/SnakeCaseNameRewriter.cs
UPD 27.04.2022:
Also, you can use the Newtonsoft.Json library if you're looking for a ready-to-use third-party solution. Its output is the same as that of the code above.
// using Newtonsoft.Json.Serialization;
var snakeCaseStrategy = new SnakeCaseNamingStrategy();
var snakeCaseResult = snakeCaseStrategy.GetPropertyName(text, false);
Solution 3:[3]
A simple LINQ-based solution. I have no idea whether it's faster or not. It basically ignores consecutive uppercase letters.
// Requires: using System.Linq;
public static string ToUnderscoreCase(this string str)
    => string.Concat((str ?? string.Empty).Select((x, i) =>
        i > 0 && i < str.Length - 1 && char.IsUpper(x) && !char.IsUpper(str[i - 1])
            ? $"_{x}"
            : x.ToString())).ToLower();
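For the output the question actually asks for ('you_g_o'), the consecutive-uppercase check has to go. A minimal sketch of that variant, assuming every capital after the first character should start a new word. The class name SnakeCaseLinqSketch and the method name ToSnakeCaseEveryCapital are made up for illustration:

```csharp
using System;
using System.Linq;

public static class SnakeCaseLinqSketch
{
    // Hypothetical helper: prefixes every upper-case character except the
    // first one with '_' and lower-cases all characters.
    public static string ToSnakeCaseEveryCapital(this string str)
        => string.Concat((str ?? string.Empty).Select((c, i) =>
            i > 0 && char.IsUpper(c)
                ? "_" + char.ToLowerInvariant(c)
                : char.ToLowerInvariant(c).ToString()));
}
```

With this sketch, "youGO".ToSnakeCaseEveryCapital() gives "you_g_o" and "LiveKarma".ToSnakeCaseEveryCapital() gives "live_karma". The trade-off is that abbreviations such as "SC" are split into single letters.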
Solution 4:[4]
Using the Newtonsoft.Json package:
// Requires: using Newtonsoft.Json.Serialization;
public static string? ToCamelCase(this string? str) => str is null
    ? null
    : new DefaultContractResolver() { NamingStrategy = new CamelCaseNamingStrategy() }.GetResolvedPropertyName(str);
public static string? ToSnakeCase(this string? str) => str is null
    ? null
    : new DefaultContractResolver() { NamingStrategy = new SnakeCaseNamingStrategy() }.GetResolvedPropertyName(str);
Solution 5:[5]
A simple loop, assuming s is a non-empty string. In essence, check whether each char is upper case; if it is, add a _, then add the char in lower case:
var newString = s.Substring(0, 1).ToLower();
foreach (char c in s.Substring(1))
{
    if (char.IsUpper(c))
    {
        newString = newString + "_";
    }
    newString = newString + char.ToLower(c);
}
Solution 6:[6]
RegEx Solution
A quick internet search turned up this site, which has an answer using RegEx. I had to modify it to grab the Value portion in order for it to work on my machine (but it has the RegEx you're looking for). I also modified it to handle null input, rather than throwing an exception:
// Requires: using System.Linq; using System.Text.RegularExpressions;
public static string ToSnakeCase2(string str)
{
    var pattern =
        new Regex(@"[A-Z]{2,}(?=[A-Z][a-z]+[0-9]*|\b)|[A-Z]?[a-z]+[0-9]*|[A-Z]|[0-9]+");
    return str == null
        ? null
        : string
            .Join("_", pattern.Matches(str).Cast<Match>().Select(m => m.Value))
            .ToLower();
}
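If the goal is exactly the output from the question ('you_g_o'), the original pattern needs only a small change: its lookbehind (?<=[a-z0-9]) refuses to match a capital that follows another capital. Here is a sketch that swaps it for (?<!^), assuming every capital except one at the very start of the string should get an underscore. The class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SnakeCaseRegexSketch
{
    // Hypothetical helper: inserts '_' before every A-Z that is not the
    // first character, then lower-cases the whole string.
    public static string ToSnakeCaseWithRegex(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }
        // (?<!^) is a negative lookbehind: "not at the start of the string".
        // $0 in the replacement stands for the whole match (the capital).
        return Regex.Replace(input, "(?<!^)[A-Z]", "_$0").ToLowerInvariant();
    }
}
```

SnakeCaseRegexSketch.ToSnakeCaseWithRegex("youGO") returns "you_g_o", and "LiveKarma" becomes "live_karma". Unlike the pattern above, this one only looks at ASCII capitals and does not keep abbreviations together.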
Non-RegEx Solution
For a non-regex solution, we can do the following:
- Collapse each run of whitespace into a single separator by
  - using string.Split to split with an empty array as the first parameter to split on all whitespace
  - joining those parts back together with the '_' character
- Prefix all upper-case characters with '_' and lower-case them
- Split and re-join the resulting string on the _ character to remove any instances of multiple consecutive underscores ("__") and to remove any leading or trailing instances of the character.
For example:
public static string ToSnakeCase(string str)
{
    return str == null
        ? null
        : string.Join("_", string.Concat(string.Join("_", str.Split(new char[] {},
                StringSplitOptions.RemoveEmptyEntries))
            .Select(c => char.IsUpper(c)
                ? $"_{c}".ToLower()
                : $"{c}"))
            .Split(new[] {'_'}, StringSplitOptions.RemoveEmptyEntries));
}
Solution 7:[7]
If you're into micro-optimizations and want to prevent unnecessary conversions wherever possible, this one might also work:
// Requires: using System.Collections.Generic; using System.Linq;
// Static local functions need C# 8 or later.
public static string ToSnakeCase(this string text)
{
    static IEnumerable<char> Convert(CharEnumerator e)
    {
        if (!e.MoveNext()) yield break;
        yield return char.ToLower(e.Current);
        while (e.MoveNext())
        {
            if (char.IsUpper(e.Current))
            {
                yield return '_';
                yield return char.ToLower(e.Current);
            }
            else
            {
                yield return e.Current;
            }
        }
    }
    return new string(Convert(text.GetEnumerator()).ToArray());
}
Solution 8:[8]
There is a well-maintained EF Core community project called EFCore.NamingConventions that implements a number of naming convention rewriters. The rewriters don't have any internal dependencies, so if you don't want to bring in an EF Core related package you can just copy the rewriter code out.
Here is the snake case rewriter: https://github.com/efcore/EFCore.NamingConventions/blob/main/EFCore.NamingConventions/Internal/SnakeCaseNameRewriter.cs
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | GregorMohorko |
| Solution 2 | |
| Solution 3 | s0n1c |
| Solution 4 | |
| Solution 5 | |
| Solution 6 | |
| Solution 7 | realbart |
| Solution 8 | satnhak |