mscript: A Programming Language for Scripting Command Line Operations






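Replace your nasty .bat files with friendly mscripts for clean and powerful command line operations
Introduction
mscript was originally designed as a teaching language. That didn't work out.
So I set out to make mscript useful to people who:
- know how to program
- are interested in scripting command line operations
- for whatever reason don't want to use PowerShell or Python to get the job done
The idea is that here is a simple scripting language, and if it solves your problem, there's no need to reach for more powerful tools.
It was originally written in C#, with all sorts of HTML, an IDE, a server, and so on. I chopped all of that out! I ported it to C++ and made it more sensible inside and out. You get a lot of value from a single 800 KB EXE script interpreter with no external dependencies. Just add it to your PATH!
The mscript Language
Here's a quick overview of mscript: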
! Caching Fibonacci sequence
~ fib(n)
! Check the cache
? fib_cache.has(n)
<- fib_cache.get(n)
}
! Compute the result
$ fib_result
? n <= 0
& fib_result = 0
? n = 1 || n = 2
& fib_result = 1
<>
& fib_result = fib(n - 1) + fib(n - 2)
}
! Stash the result in the cache
* fib_cache.add(n, fib_result)
! All done
<- fib_result
}
! Our cache is an index, a hash table, any-to-any
$ fib_cache = index()
! Print the first 10 values of the Fibonacci series
! Look, ma! No keywords!
# n : 1 -> 10
> fib(n)
}
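It's a line-based, pseudo-object-oriented scripting language that uses symbols instead of keywords.
It doesn't care about whitespace. There are no semicolons.
Objects
In mscript, every variable contains an object (think .NET's Object, but more like VB6's VARIANT).
An object can be one of six types of things:
- null
- number - double-precision floating point
- string - std::wstring
- bool
- list - std::vector<object>
- index - std::map<object, object> that preserves insertion order, a vector map
Lists and indexes are copied by reference; everything else is copied by value.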
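To make the "map that preserves insertion order" idea concrete, here is a minimal sketch of one way to build such a container. The class name VectorMap and its methods are hypothetical and chosen for illustration; this is not the vectormap code in mscript-lib, which may be organized quite differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insertion-ordered map: the vector keeps the order in
// which pairs were added, the hash map finds each key's position.
template <typename K, typename V>
class VectorMap
{
public:
    // Add a new pair; adding an existing key is an error in this sketch
    void add(const K& key, const V& value)
    {
        if (m_positions.count(key) != 0)
            throw std::runtime_error("key already present");
        m_positions[key] = m_entries.size();
        m_entries.emplace_back(key, value);
    }

    bool has(const K& key) const { return m_positions.count(key) != 0; }

    const V& get(const K& key) const
    {
        auto it = m_positions.find(key);
        if (it == m_positions.end())
            throw std::runtime_error("key not found");
        return m_entries[it->second].second;
    }

    // Keys come back in the order they were added, not sorted
    std::vector<K> keys() const
    {
        std::vector<K> result;
        result.reserve(m_entries.size());
        for (const auto& entry : m_entries)
            result.push_back(entry.first);
        return result;
    }

    std::size_t size() const { return m_entries.size(); }

private:
    std::vector<std::pair<K, V>> m_entries;
    std::unordered_map<K, std::size_t> m_positions;
};
```

With a container like this, adding "zebra" and then "apple" gives back the keys in that same order, which a plain std::map would not do.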
mscript Statements
/* a block
comment
*/
! a single-line comment, on its own line, can't be at the end of a line
> "print the value of an expression, like this string, including pi: " + round(pi, 4)
>> print exactly what is on this line, allowing for any "! '= " 0!')* nonsense you'd like
{>>
every line
in "here"
is printed "as-is"
>>}
! Declare a variable with an optional initial value
! With no initial value, the variable has the null value
$ new_variable = "initial value"
! A variable assignment
! Once a variable has a non-null value, the variable cannot be assigned
! to a value of another type
! So mscript is somewhat dynamically typed
& new_variable = "some other value"
! The O signifies an unbounded loop, a while(true) type of thing
! All loops end in a closing curly brace, but do not start with an opening one
O
...
! the V statement is break
> "gotta get out!"
V
}
! If, else if, else
! No curly braces at ends of each if or else if clause,
! just at the end of the overall statement
? some_number = 12
& some_number = 13
? some_number = 15
& some_number = 16
<>
& some_number = -1
}
! A foreach loop
! list(1, 2, 3) creates a new list with the given items
! This statement processes each list item, printing them out
! Note the string promotion in the print line
@ item : list(1, 2, 3)
> "Item: " + item
}
! An indexing loop
! Notice the pseudo-OOP of the my_list.length() and my_list.get() calls
! This is syntactic sugar for calls to global functions,
! length(my_list) and get(my_list, idx)
$ my_list = list(1, 2, 3)
# idx : 0 -> my_list.length() - 1
> "Item: " + my_list.get(idx)
}
{
! Just a little block statement for keeping variable scopes separate
! Variables declared in here...
}
! ...are not visible out here
! Functions are declared like other statements
~ my_function (param1, param2)
! do something with param1 and param2
! Function return values...
! ...with a value
<- 15
! ...without a value
<-
}
! A little loop example
~ counter(low_value, high_value)
$ cur_value = low_value
$ counted = list()
O
! Use the * statement to evaluate an expression and discard its return value
! Useful for requiring deliberate ignoring of return values
* counted.add(cur_value)
& cur_value = cur_value + 1
? cur_value > high_value
! Use the V statement to leave the loop
V
<>
! Use the ^ statement to go back up to the start of the loop, a continue statement
^
}
}
<- counted
}
! Load and run another script here, an import statement
! The script path is an expression, so you can dynamically load different things
! Scripts are loaded relative to the script they are imported from
! Scripts loaded in this way are processed just like top-level scripts,
! so they can declare global variables, define functions, and...execute script statements
! Plenty of rope...
+ "some_other_script.ms"
mscript Expressions
mscript statements make use of expressions, some simple, some quite powerful.
Binary operators, from lowest to highest precedence:
or || and && != <= >= < > = % - + / * ^
Unary operators: - ! not
An expression can be:
null
true
false
number
string
dquote
squote
tab
lf
cr
crlf
pi
e
variable as defined by a $ statement
Strings can be double- or single-quoted, 'foo ("bar")' and "foo ('bar')" are valid;
this is handy for building command lines that involve lots of double-quotes;
just use single quotes around them.
String promotion:
If either side of binary expression evaluates to a string,
the expression promotes both sides to string
Bool short-circuiting:
The left expression is evaluated first
If && and left is false, expression is false
If || and left is true, expression is true
Standard math functions, for your math homework:
abs asin acos atan ceil cos cosh exp floor
log log2 log10 round sin sinh sqrt tan tanh
getType(obj) - the type of an object obj as a string
- you can also say obj.getType()
- see the shorthand?
number(val) - convert a string or bool into a number
string(val) - convert anything into a string
list(item1, item2, etc.) - create a new list with the elements passed in
index(key1, value1, key2, value2) - create a new index with the pairs of keys
and values passed in
obj.clone() - deeply clone an object, including indexes containing list values, etc.
obj.length() - C++ .size(), string or list length, or index pair count
obj.add(to_add1, to_add2...) - append to a string, add to a list, or add pairs to an index
obj.set(key, value) - set a character in a string, change the value at a key in a list or index
obj.get(key) - return the character of a string, the element in a list,
or the value for the key in an index
obj.has(value) - returns if string has substring, list has item, or index has key
obj.keys(), obj.values() - index collection access
obj.reversed() - returns copy of obj with elements reversed, including keys of an index
obj.sorted() - returns a copy of obj with elements sorted, including index keys
join(list_obj, separator) - join list items together into a string
split(str, separator) - split a string into a list of items
trim(str) - return a copy of a string with any leading or trailing whitespace removed
toUpper(str), toLower(str) - return a copy of a string in upper or lower case
str.replaced(from, to) - return a copy of a string with characters replaced
random(min, max) - return a random value in the range min -> max
obj.firstLocation(toFind), obj.lastLocation(toFind) - find the first or last location
of a substring in a string or item in a list
obj.subset(startIndex[, length]) - return a substring of a string or a slice of a list,
with an optional length
obj.isMatch(regex) - see if a string is a match for a regular expression
obj.getMatches(regex) - return a list of matches from a regular expression applied to a string
exec(cmd_line) - execute a command line, return an index with keys
("success", "exit_code", "output")
This is the main function that gives mscript meaning in life.
You build your command line, you call exec, and it returns an index with all you need to know.
Write all the script you want around calls to exec, and get a lot done.
exit(exit_code) - exit the script with an exit code
error(error_msg) - raise an error, reported by the script interpreter
readFile(file_path, encoding) - read a text file into a string, using the specified encoding,
either "ascii", "utf8", or "utf16"
writeFile(file_path, file_contents, encoding) - write a string to a text file with an encoding
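The string promotion rule above can be sketched in C++ as follows. This is only an illustration under stated assumptions: the Value struct, makeNumber, makeString, toText, and add are hypothetical names, and the way numbers are formatted as text here (stream output) may not match what mscript prints.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for an mscript object holding a number or a string
struct Value
{
    bool isString;
    double number;
    std::string text;
};

Value makeNumber(double n) { return Value{ false, n, std::string() }; }
Value makeString(const std::string& s) { return Value{ true, 0.0, s }; }

// Convert either kind of value to text (number formatting is an assumption)
std::string toText(const Value& v)
{
    if (v.isString)
        return v.text;
    std::ostringstream out;
    out << v.number;
    return out.str();
}

// The + operator: if either side is a string, both sides are promoted
// to strings and concatenated; otherwise it is plain numeric addition
Value add(const Value& left, const Value& right)
{
    if (left.isString || right.isString)
        return makeString(toText(left) + toText(right));
    return makeNumber(left.number + right.number);
}
```

This is the behavior that lets a print line like > "Item: " + item work without an explicit string() call.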
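A Little Magic
If a function name is neither a built-in function nor a user-defined function, and the name is a variable name and that variable holds a string, then the variable's value becomes the function name, and that function is called with the same arguments.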
So if you have...
~ addTogether(one, two)
<- one + two
}
and
~ powTogether(one, two)
<- one ^ two
}
...you can then use...
$ func = "addTogether"
$ added = func(2, 3)
! added is 5
& func = "powTogether"
$ powed = func(2, 3)
! powed is 8
A poor man's function pointers. Cool, right?
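A Little More Magic
As mentioned earlier, mscript makes function calls of the form something.function(param1...) "object-oriented" by passing something as the first argument to the function, i.e., function(something, param1...). You cannot chain these calls, and something must be a variable name, not any other kind of expression. This shorthand makes scripts easier to read and write.
If you want to get a lot done in one line of code, you can nest function calls as deeply as you like: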
$ lines = split(trim(replaced(get(exec("dir"), "output"), crlf, lf)), lf)
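The something.function(...) shorthand amounts to a simple rewrite. The sketch below shows that rewrite at the string level, handling only the simplest shape of call. The function name desugarMethodCall is hypothetical, and a real interpreter would more likely do this while parsing expressions rather than on raw text.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite "var.func(args)" as "func(var, args)".
// Anything that does not have that exact shape is returned unchanged.
std::string desugarMethodCall(const std::string& expr)
{
    const std::size_t dot = expr.find('.');
    const std::size_t paren = expr.find('(');
    if (dot == std::string::npos || paren == std::string::npos)
        return expr;
    if (dot == 0 || dot + 1 >= paren || expr.back() != ')')
        return expr;

    // The receiver must be a plain variable name, not an expression
    if (std::isdigit(static_cast<unsigned char>(expr[0])))
        return expr;
    for (std::size_t i = 0; i < dot; ++i)
    {
        const unsigned char c = static_cast<unsigned char>(expr[i]);
        if (!std::isalnum(c) && c != '_')
            return expr;
    }

    const std::string receiver = expr.substr(0, dot);
    const std::string function = expr.substr(dot + 1, paren - dot - 1);
    const std::string args = expr.substr(paren + 1, expr.size() - paren - 2);
    const bool hasArgs = args.find_first_not_of(" \t") != std::string::npos;

    // The receiver becomes the first argument of the global function
    return function + "(" + receiver + (hasArgs ? ", " + args : std::string()) + ")";
}
```

The variable-name check mirrors the restriction stated above: something must be a variable name, so an input like round(3.5) is left alone.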
Code of Note
popen
The core utility of mscript is the exec function.
In C++...
// Initialize our output object, an index
object::index retVal;
retVal.set(toWideStr("success"), false);
retVal.set(toWideStr("exit_code"), -1.0);
retVal.set(toWideStr("output"), toWideStr(""));
// Run the program.
FILE* file = _wpopen(paramList[0].stringVal().c_str(), L"rt");
if (file == nullptr)
return retVal;
// Read the program's output into a string
char buffer[4096];
std::string output;
while (fgets(buffer, sizeof(buffer), file))
output.append(buffer);
retVal.set(toWideStr("output"), toWideStr(output));
// If we stopped at EOF, things went well
retVal.set(toWideStr("success"), bool(feof(file)));
// Close the "file" and get our exit code
int result = _pclose(file);
retVal.set(toWideStr("exit_code"), double(result));
file = nullptr;
// All done
return retVal;
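popen is the secret sauce. Call it, read the output, close the file, get the exit code, and return it all. Using an index object this way allows for multiple return values.
std::string <=> std::wstring Conversions
For years, I've used character encoding code like this: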
std::string toNarrowStr(const std::wstring& str)
{
std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
return converter.to_bytes(str);
}
std::wstring toWideStr(const std::string& str)
{
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
return converter.from_bytes(str);
}
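I noticed that the test scripts were running very slowly. I discovered that Visual Studio Community Edition includes a profiler (wow!), and it showed that 60% of the program's execution time was spent in toWideStr, specifically in converter.from_bytes(str). One of this project's sample scripts runs dir over a large and varied directory structure, and the mscript interpreter crashed while processing the dir output.
Something had to give.
So I adopted the Win32 character conversion functions for correctness, with a pre-conversion check for speed. Lots of strings need no special handling, especially in mscript, so I optimized for those cases.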
std::wstring toWideStr(const std::string& str)
{
bool allNarrow = true;
{
const unsigned char* bytes = reinterpret_cast<const unsigned char*>(str.data());
for (size_t i = 0; i < str.size(); ++i)
{
if (bytes[i] > 127)
{
allNarrow = false;
break;
}
}
}
if (allNarrow)
{
std::wstring retVal;
retVal.reserve(str.size());
for (char c : str)
retVal += wchar_t(c);
return retVal;
}
int needed = MultiByteToWideChar(CP_UTF8, 0, str.data(), int(str.size()), nullptr, 0);
if (needed <= 0)
raiseError("MultiByteToWideChar failed");
std::wstring result(needed, 0);
MultiByteToWideChar(CP_UTF8, 0, str.data(), int(str.size()), result.data(), needed);
return result;
}
std::string toNarrowStr(const std::wstring& str)
{
if (str.empty())
return std::string();
bool allAscii = true;
for (wchar_t c : str)
{
if (c <= 0 || c > 127)
{
allAscii = false;
break;
}
}
if (allAscii)
{
std::string retVal;
retVal.reserve(str.size());
for (auto c : str)
retVal += char(c);
return retVal;
}
int needed = WideCharToMultiByte(CP_UTF8, 0, str.data(),
int(str.size()), nullptr, 0, nullptr, nullptr);
if (needed <= 0)
raiseError("WideCharToMultiByte failed");
std::string output(needed, 0);
WideCharToMultiByte(CP_UTF8, 0, str.data(), int(str.size()),
output.data(), needed, nullptr, nullptr);
return output;
}
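I don't have hard numbers, but the performance improvement was night and day, and the dir output was processed and usable. Problem solved in one shot.
Sample Script: musicdb
This sample script runs dir over your entire music library and parses out what look like artists, albums, and tracks. You can then enter search terms to find artists of interest and browse their catalogs. If arguments are passed to the script on the command line, the script treats them as search terms, silently loads and processes the music library, runs the search, outputs the results, and exits.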
! If we have command line arguments, those are our search terms
! Handle things directly, no "UI", just results
? arguments.length() > 0
$ lines = loadLines()
$ db = index()
* processLines(db, lines, false)
$ matching_artists = getMatchingArtistNames(db, arguments)
@ matching_artist : matching_artists
* summarizeArtist(db, matching_artist)
>>
}
! Exit the script with a good exit code
<- 0
}
>> Loading music files...
$ lines = loadLines()
> "Music Files: " + lines.length()
$ db = index()
* processLines(db, lines, true)
>> All done
* outputStats(db)
O
>>
>> Enter artist name search string, as much as you'd like, however you'd like:
$ matching_artists = getMatchingArtistNames(db, split(input(), " "))
> "Matching Artists: (" + matching_artists.length() + ")"
? matching_artists.length() > 0
> matching_artists.join(lf)
}
@ matching_artist : matching_artists
* summarizeArtist(db, matching_artist)
>>
}
}
! Do the dir of the user's Music directory
! Surprisingly fast!
~ loadLines()
$ result_index = exec('dir /B /S "C:\Users\%USERNAME%\Music"')
? !result_index.get("success")
> "Running the dir command failed with exit code " + result_index.get("exit_code")
<- list()
}
<- split(replaced(result_index.get("output"), crlf, lf), lf)
}
! Walk dir output processing each line in turn
~ processLines(db, lines, should_pacify)
? should_pacify
$ line_count = 0
@ line : lines
* processLine(db, line)
& line_count = line_count + 1
? (line_count % 5000) = 0
> line_count
}
}
<>
@ line : lines
* processLine(db, line)
}
}
}
! Given a line from the dir output, add a track to our database...
! ...if it's a somewhat valid line
~ processLine(db, line)
! Split up path, bail if not at least artist\album\track
& line = line.trim()
$ parts = line.split('\')
? parts.length() < 3
! not deep enough
<- false
}
$ filename = parts.get(parts.length() - 1)
$ dot_index = filename.lastLocation('.')
? dot_index <= 0
! not a file
<- false
}
$ track = filename.subset(0, dot_index)
$ artist = parts.get(parts.length() - 3)
$ album = parts.get(parts.length() - 2)
! DEBUG
!> "Artist: " + artist + " - Album: " + album + " - Track: " + track
! Ensure the artist is in the DB
? !db.has(artist)
* db.add(artist, index())
}
$ artist_index = db.get(artist)
! Ensure the artist has the album
? !artist_index.has(album)
* artist_index.add(album, list())
}
$ album_list = artist_index.get(album)
! Add the track to the album
* album_list.add(track)
! All done
<- true
}
! Walk the database of artist collections gathering and outputting stats
~ outputStats(db)
$ artists = db.keys()
$ album_count = 0
$ track_count = 0
@ artist : artists
$ artist_index = db.get(artist)
@ album_name : artist_index.keys()
& album_count = album_count + 1
$ album_tracks = artist_index.get(album_name)
& track_count = track_count + album_tracks.length()
}
}
>>
> "Artists: " + artists.length()
? artists.length() > 0
> "Albums: " + album_count + " - albums / artist = " +
round(album_count / artists.length())
> "Tracks: " + track_count + " - tracks / album = " +
round(track_count / album_count)
}
}
! Given search terms, return the names of artists that match all terms
~ getMatchingArtistNames(db, parts)
! Normalize the input, trimmed and lowered
$ normalized_parts = list()
@ part : parts
$ normalized_part = trim(toLower(part))
? normalized_part.length() > 0
* normalized_parts.add(normalized_part)
}
}
! Walk the artists finding matches
$ matching_artists = list()
? normalized_parts.length() = 0
<- matching_artists
}
@ artist : db.keys()
$ artist_lower = toLower(artist)
$ match = true
@ part : normalized_parts
? !artist_lower.has(part)
& match = false
V
}
}
? match
* matching_artists.add(artist)
}
}
<- matching_artists
}
! Output an artist's collection
~ summarizeArtist(db, artist_name)
> "Artist: " + artist_name
$ artist_albums = db.get(artist_name)
@ album : artist_albums.keys()
> " Album: " + album
$ album_tracks = artist_albums.get(album)
@ album_track : album_tracks
> " " + album_track
}
}
}
Project Layout
In the attached source.zip, you'll find everything, including the Visual Studio solution file.
Here are the projects that make up the solution:
mscript-lib
The mscript-lib project implements expressions and statements. All of the working code in the solution is in this project.
- expressions
- object
- script_processor
- symbols
- utils
- vectormap
mscript-tests
Unit tests
mscript-test-runner
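Unit tests can only do so much before the test code becomes large and unwieldy. Rather than write bad unit tests, I created a set of files that contain the script to execute and the expected results. So in mscript-test-scripts, you'll find test files with statements at the top and expected results below, separated by ===.
mscript-test-runner runs all the scripts in the directory and verifies that each one gets the expected results.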
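As a rough illustration of the test file format, the following C++ sketch splits a test file's contents into the script part and the expected-results part. The function name splitTestFile is hypothetical, and the sketch assumes the separator is a line containing exactly ===; the real test runner may parse these files differently.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: split test file contents into
// (script statements, expected results) at the first === line
std::pair<std::string, std::string> splitTestFile(const std::string& contents)
{
    std::istringstream stream(contents);
    std::string line, script, expected;
    bool inExpected = false;
    while (std::getline(stream, line))
    {
        // Tolerate Windows line endings
        if (!line.empty() && line.back() == '\r')
            line.pop_back();

        // Assumption: the separator is a line that is exactly ===
        if (!inExpected && line == "===")
        {
            inExpected = true;
            continue;
        }
        (inExpected ? expected : script) += line + "\n";
    }
    return { script, expected };
}
```

A runner built on this would execute the first part, capture what it prints, and compare that with the second part.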
mscript
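This is the script interpreter. All of the code is in mscript-lib, so this is just a shell around that project.
The tricky part of the mscript program is loading secondary scripts as requested by + statements. The path to a secondary script is relative to the script that imports it. So if you tell the interpreter to run c:\my_scripts\mine.ms, and mine.ms imports yours.ms, then yours.ms is looked for in c:\my_scripts, not in the current directory or anywhere else.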
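The import path rule can be sketched with plain string handling. The function name resolveImportPath is hypothetical, and this sketch assumes the imported name is always a relative path; it makes no attempt to handle absolute import paths or to normalize ".." segments.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: resolve an imported script's path against the
// directory of the script that imports it
std::string resolveImportPath(const std::string& importingScript,
                              const std::string& importedName)
{
    // Accept either kind of path separator
    const std::size_t lastSep = importingScript.find_last_of("\\/");
    if (lastSep == std::string::npos)
        return importedName; // the importing script has no directory part

    // Keep the directory, including its trailing separator
    return importingScript.substr(0, lastSep + 1) + importedName;
}
```

Called with c:\my_scripts\mine.ms and yours.ms, this yields c:\my_scripts\yours.ms, regardless of the current directory.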
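Conclusion
That was a deep dive into the mscript language: a list of symbol-driven statement types, an expression syntax with a purpose-built function library and a little magic, and a sample script that puts it all to practical use.
This is the second version of mscript. I hope it resonates with you and that you can use simple, clean mscripts to get big command line jobs done. I'd love to hear your feedback, so please speak up in the comments. Thanks! Enjoy!
History
- February 7, 2022: Initial version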