Strings
This plugin introduces dynamically allocated mutable strings (of cells). These strings are manipulated through their addresses in memory (pointers), tagged either String or ConstString (more on the difference later). Such a string can also be passed to (almost) any native function without intermediate copying of the characters.
- Usage
- Native interoperability
- The null string
- String lifetime and garbage collection
- Null characters
- Strings as arrays and arrays as strings
- Packed strings
- Examples
PawnPlus supports several ways of creating dynamic strings, each made for a different purpose. The first is to use @ (an alias of str_new_static) on a literal string expression:
new String:str = @("Dynamic string");
str_new_static has an additional parameter that receives the length of the string by default, so the length doesn't have to be computed at run time. This means that the string can contain null characters: every cell up to the last one (the terminating null character) is treated as part of the string. Use str_new if the string is shorter than the array that contains it.
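The difference between the two constructors shows up when the array is larger than the text it holds. The following is a minimal sketch, assuming the length parameter of str_new_static defaults to the size of the array passed in; the variable names are only illustrative.

```pawn
#include <a_samp>
#include <PawnPlus>

public OnFilterScriptInit()
{
    // A 32-cell array holding a 5-character string; the rest is zero-filled.
    new buffer[32] = "short";

    // str_new stops at the first null cell.
    new String:trimmed = str_new(buffer);

    // str_new_static takes its length from the array instead (assumption),
    // so the trailing zero cells become part of the string.
    new String:padded = str_new_static(buffer);

    // Compare the two lengths to see the effect.
    printf("%d %d", str_len(trimmed), str_len(padded));
    return 1;
}
```

For string literals the array is exactly as large as the text, which is why @ is the natural choice there.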
This string exists regardless of its source, and its data is not bound to any AMX machine or script. It can be passed to any functions, or returned from functions:
stock String:GetHalf(ConstStringTag:str)
{
return str_sub(str, 0, str_len(str)/2);
}
Strings (with one exception) are always mutable. This means that functions can modify the characters of the string, so it is useful to distinguish functions that don't modify the string by using ConstString (like you would use a const array). str_sub creates a new string, but the GetHalf function could also be written to modify the string:
stock String:MakeHalf(StringTag:str)
{
return str_del(str, str_len(str)/2);
}
Instead of returning a substring, str_del deletes the second half of the string and returns the same string instance it was provided with. Therefore, this code is correct:
new String:str1 = @("Hello");
new String:str2 = GetHalf(str1); // creates a new instance
assert(str1 == @("Hello")); // by-value equality
assert(str2 == @("He"));
new String:str3 = MakeHalf(str1); // keeps the same instance
assert(_:str1 == _:str3); // by-reference equality (identity)
assert(str3 == @("He"));
For convenience, there are three operators defined on dynamic strings: + (concatenation, routed to str_cat), == (by-value equality, routed to str_eq), and % (concatenation, more details below). Using other operators on strings is strictly prohibited since it's most likely a mistake.
Because the Pawn compiler reorders the arguments of + if they do not all have the same tag, % has to be used in that case. Integers and floats can also be implicitly converted to strings when they are used in a string position, but this behaviour must be enabled by defining PP_SYNTAX_STRING_OP.
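The operators can be sketched in use as follows. This assumes that str_cat returns a new string and leaves both operands unchanged; the variable names are only illustrative.

```pawn
#include <a_samp>
#include <PawnPlus>

public OnFilterScriptInit()
{
    new String:hello = @("Hello ");
    new String:world = @("world!");

    // + is routed to str_cat; both operands carry the same tag here.
    new String:joined = hello + world;

    // == compares the contents (str_eq), not the addresses.
    assert(joined == @("Hello world!"));

    // Assumption: concatenation does not modify its operands.
    assert(hello == @("Hello "));

    // % concatenates as well and is the fallback when the tags differ.
    print_s(hello % world);
    return 1;
}
```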
You can easily copy the contents of the string back to a buffer:
new String:str = @("Dynamic string");
new buffer[16];
str_get(str, buffer);
print(buffer);
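When the string may be longer than the buffer, a length check avoids silently losing characters. A minimal sketch, assuming str_get copies no more cells than fit into the buffer (this truncation behaviour is an assumption) and using print_s, which PawnPlus provides for printing dynamic strings:

```pawn
#include <a_samp>
#include <PawnPlus>

public OnFilterScriptInit()
{
    new String:str = @("Dynamic string");
    new buffer[16];

    // The buffer needs one extra cell for the terminating null character.
    if(str_len(str) < sizeof(buffer))
    {
        str_get(str, buffer);
        print(buffer);
    }
    else
    {
        // Too long for the fixed buffer: print the dynamic string directly.
        print_s(str);
    }
    return 1;
}
```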
Almost any native function can be changed so that it takes a dynamic string instead of a character array. Let's start with a simple function like print:
native print(const string[]);
The native function expects an address of a string inside the AMX machine's memory. However, this plugin enables you to pass it an address outside the machine and it will interpret it as a string if possible. The modification is simple:
native print_s(ConstAmxString:string) = print;
The tag must be AmxString or ConstAmxString instead of String because the address itself must be relative to the abstract machine's memory. Internally, str_addr or str_addr_const is called for the conversion, which returns the offset address. The = allows changing the name of a native in the script while still referring to the same function.
new String:str1 = @("Hello ");
new String:str2 = @("world!");
print_s(str1+str2);
PawnPlus already defines print_s for convenience.
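The same pattern carries over to other natives that only read their string argument. As an illustration, the sketch below aliases the SA-MP native SendClientMessage; the name SendClientMessageStr is a hypothetical choice for this example and is not claimed to be defined by PawnPlus.

```pawn
#include <a_samp>
#include <PawnPlus>

// Hypothetical alias: same native, but the message is a dynamic string.
native SendClientMessageStr(playerid, color, ConstAmxString:message) = SendClientMessage;

public OnPlayerConnect(playerid)
{
    // Both operands are tagged String, so + can be used safely.
    new String:msg = @("Welcome to ") + @("the server!");

    // The String value is converted to an AMX-relative address on the call.
    SendClientMessageStr(playerid, 0xFFFFFFFF, msg);
    return 1;
}
```

Since the message parameter is const in the original native, ConstAmxString is the matching tag.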
Unfortunately, printf cannot be modified in such a way, because it doesn't use the standard AMX API to access its parameters. For variadic functions (with ...), the conversion to AmxString is not done automatically and must be done manually:
native CallLocalFunctionStr(const function[], const format[], {AmxString,Float,_}:...) = CallLocalFunction;
public OnFilterScriptInit()
{
new String:str1 = @("Hello ");
new String:str2 = @("world!");
pp_hook_check_ref_args(true); // required for the result of str_addr to be picked up
CallLocalFunctionStr(#StringReceiver, "s", str_addr(str1+str2));
}
forward StringReceiver(str[]);
public StringReceiver(str[])
{
print(str);
}
If str_addr hadn't been used, the compiler would issue a warning, but wouldn't attempt to convert the value. This is also the second way to extract the contents of a dynamically allocated string, one which doesn't require knowing the size of the buffer.
Using dynamic strings is safe if the function uses the string as its input and doesn't modify the contents (usually coupled with const in the declaration), but it is also possible to use them as buffers for functions that modify the contents.
Converting these functions is not always simple or consistent, because while standard Pawn functions and plugins use amx_GetAddr, SA-MP doesn't use it for output strings, and it computes the pointer from the address directly, without any checks. "Well-behaved" functions can be converted in the standard way, but the size of the string must be taken into account:
native strcat_s_impl(AmxString:dest, const source[], maxlength) = strcat;
stock strcat_s(StringTag:dest, const source[])
{
return strcat_s_impl(dest, source, str_len(dest) + 1);
}
The correct size is indeed str_len(dest) + 1, because the actual buffer includes the null character. Unfortunately, calling this produces an error, since strcat itself checks the validity of the address it obtains. This cannot be circumvented without relying on the memory layout of the executable, but only the standard library functions do this.
Normal SA-MP functions access the address directly, which means that the address of the actual character data must be passed. This is represented by AmxStringBuffer:
native GetPlayerNameStrImpl(playerid, AmxStringBuffer:name, len) = GetPlayerName;
stock String:GetPlayerNameStr(playerid)
{
new String:str = str_new_buf(MAX_PLAYER_NAME);
str_resize(str, GetPlayerNameStrImpl(playerid, str, MAX_PLAYER_NAME));
return str;
}
GetPlayerNameStrImpl can be called directly, but for convenience, a function that creates the output string automatically should be used. str_new_buf creates a new empty string and sets its size to size - 1. The conversion to AmxStringBuffer: returns the address of the characters, which is guaranteed to be a block of memory of at least size bytes (including the null character). GetPlayerName then writes directly into the buffer and returns the number of characters written, which is then used to truncate the string (null characters aren't used to determine the size of dynamic strings).
Since AmxStringBuffer: is the actual address of the characters, you can add an offset to it to produce a pointer into the middle of the string. Currently, the offset is measured in bytes, but this may change to cells in the future, so it should not be relied upon:
new String:str = @("My name is _______________________");
GetPlayerNameStrImpl(playerid, str_buf_addr(str)+44, MAX_PLAYER_NAME);
print_s(str);
There is a special string value, STRING_NULL, which can be used in all functions but is always empty and cannot be modified (unless the modification would result in an empty string). It also has a special behaviour when used as an argument for variadic functions:
public OnFilterScriptInit()
{
CallLocalFunctionStr(#Func, "s", str_addr(STRING_NULL));
}
forward Func(str[]);
public Func(str[])
{
printf("%d", str[0]); //1
}
Since these functions generally crash when passed an empty string, when STRING_NULL is passed to them, it is converted to "\1;" instead of an empty string.
PawnPlus employs a mechanism similar to garbage collection for strings and other objects. However, because it cannot reliably scan the memory and find if a string is used, it needs hints to know when a string is "owned", in the form of str_acquire and str_release. More information about this mechanism here.
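A typical case is a string kept in a global variable and used from several callbacks. The sketch below assumes that a string which has not been acquired may be collected once the callback that created it has finished; g_greeting is a hypothetical variable name.

```pawn
#include <a_samp>
#include <PawnPlus>

new String:g_greeting; // hypothetical global holding an owned string

public OnFilterScriptInit()
{
    g_greeting = @("Welcome!");
    // Mark the string as owned so that it survives beyond this callback.
    str_acquire(g_greeting);
    return 1;
}

public OnPlayerConnect(playerid)
{
    // Safe to use here only because the string was acquired above.
    print_s(g_greeting);
    return 1;
}

public OnFilterScriptExit()
{
    // Give up ownership; the string may be collected afterwards.
    str_release(g_greeting);
    return 1;
}
```

Every str_acquire should be paired with a str_release once the string is no longer needed.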
Strings in Pawn (and SA-MP) are null-terminated, meaning that the string end is located at the first zero cell. There are two ways to get around this problem – store the length together with the string, or use another character:
new str[] = "A\256;bit longer";
printf("%d %s", strlen(str), str); //12 A
256 (0x100) does not fit into a byte, and so it is truncated into a null character when displayed, but functions like strlen check cells and not bytes. By default, str_new respects the original cells of the string and does not change them:
new String:str = str_new("A\256;bit longer");
printf("%d %d", str_len(str), str_getc(str, 1)); //12 256
In some cases, you might want to represent the string how it was intended (i.e. with a proper null character). In this case, use str_truncate as the second argument to str_new:
new String:str = str_new("A\256;bit longer", str_truncate);
printf("%d %d", str_len(str), str_getc(str, 1)); //12 0
str_truncate will truncate all cells to a single byte. However, this creates a problem in SA-MP functions that use amx_StrLen to compute the length of the string, because they stop at the embedded null character. To fix it, this plugin also hooks the function, but since this affects almost any call to a native function taking a string, you might want to disable the hook:
native strlen_s(AmxString:string) = strlen;
public OnFilterScriptInit()
{
new String:str = str_new("A\256;bit longer", str_truncate);
printf("%d", strlen_s(str)); //12
pp_hook_strlen(false);
printf("%d", strlen_s(str)); //1
}
Since dynamic strings can hold any number of any cells, they can effectively also store standard arrays, albeit they are not easily accessed:
enum STRUCT
{
S_FIELD1,
Float:S_FIELD2,
S_FIELD3[16]
}
public OnFilterScriptInit()
{
new data[STRUCT];
data[S_FIELD1] = -1729;
data[S_FIELD2] = 1.618034;
data[S_FIELD3] = "abcdefghijklmno";
new String:str = str_new_arr(data[STRUCT:0], _:STRUCT);
printf("%d", str_getc(str, 0)); //-1729
new data2[_:STRUCT + 1];
str_get(str, data2);
print(data2[_:S_FIELD3]); //abcdefghijklmno
}
Variants are more suited for storing standard (tagged) arrays, and lists and maps are better for complex objects.
Strings in Pawn can be stored as packed or as unpacked. Unpacked strings store every character in a single cell, while packed strings store them more efficiently, with 4 characters in a single cell. In order for any function to determine if a string is packed or unpacked, packed characters start at the most significant byte in a cell. Therefore, a dynamic string (which is always unpacked) that starts with a cell that is negative or larger than 0xFFFFFF will not be recognized correctly when passed to a native function.
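Before passing arbitrary cell data to a native function, the first cell can be checked against this rule. A minimal sketch; WouldLookPacked is a hypothetical helper written for this example, not part of PawnPlus.

```pawn
#include <a_samp>
#include <PawnPlus>

// Hypothetical helper: true if a native would mistake the string for a packed one.
stock bool:WouldLookPacked(ConstStringTag:str)
{
    // An empty string has no first cell to misinterpret.
    if(str_len(str) == 0) return false;

    new first = str_getc(str, 0);
    // Negative or above 0xFFFFFF means the most significant byte is set.
    return first < 0 || first > 0xFFFFFF;
}

public OnFilterScriptInit()
{
    // Raw cell data whose first cell is negative, stored in a string.
    new cells[] = {-1729, 65, 66};
    new String:raw = str_new_arr(cells);

    assert(WouldLookPacked(raw));
    assert(!WouldLookPacked(@("AB")));
    return 1;
}
```

A string that fails this check is still a valid dynamic string; it is only unsafe to hand it to natives that inspect the first cell.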