
Strings

IllidanS4 edited this page Mar 27, 2018 · 12 revisions

This plugin introduces dynamically allocated strings (of cells). These strings are manipulated using their addresses in memory (pointers), tagged either String or GlobalString (more on the difference later). It is also possible to pass such a string to (almost) any native function without intermediate copying of the characters.

Usage

The standard way of creating such a string is using @ (an alias of str_new_static) on an array of characters:

new String:str = @("Dynamic string");

This string exists regardless of its source, and its data is not bound to any AMX machine or script. It can be passed to any function, or returned from functions:

stock String:GetHalf(StringTag:str)
{
    return str_sub(str, 0, str_len(str)/2);
}

Strings (with one exception) are mutable. Some functions modify the characters of the string they receive, while others create a new string, and you have to know which is which. str_sub creates a new string, but the GetHalf function could also be written to modify the string:

stock String:MakeHalf(StringTag:str)
{
    return str_del(str, str_len(str)/2);
}

Instead of returning a substring, str_del deletes the second half of the string and returns the same string instance it was provided with. Therefore, this code is correct:

new String:str1 = @("Hello");
new String:str2 = GetHalf(str1); // creates a new instance
assert(str1 == @("Hello")); // by-value equality
assert(str2 == @("He"));
new String:str3 = MakeHalf(str1); // keeps the same instance
assert(_:str1 == _:str3); // by-reference equality (identity)
assert(str3 == @("He"));

For convenience, there are three operators defined on dynamic strings: + (concatenation, routed to str_cat), == (by-value equality, routed to str_equal), and %. Using other operators on strings is strictly prohibited since it's most likely a mistake.

Because the Pawn compiler reorders the arguments of + when they do not all have the same tag, % has to be used in that case. Integers and floats can also be converted to strings automatically if they are used in a string position, but this behaviour can be disabled by defining PP_NO_AUTO_STRINGS. % can concatenate strings with values by default.
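A minimal sketch of the three operators together. The include name PawnPlus and the helper MakeLabel are assumptions made for illustration; print_s is declared the same way as in the native interoperability section of this page.

```pawn
#include <a_samp>
#include <PawnPlus> // assumed include name for this plugin

native print_s(AmxString:string) = print;

// Hypothetical helper: joins a string with an integer value.
stock String:MakeLabel(score)
{
    // % is used instead of + because the operands have different tags.
    return @("Score: ") % score;
}

public OnFilterScriptInit()
{
    new String:label = MakeLabel(42);
    // + concatenates two dynamic strings.
    print_s(label + @("!"));
    // == compares the characters, not the addresses.
    if (label == MakeLabel(42))
    {
        print("equal by value");
    }
}
```

If PP_NO_AUTO_STRINGS is defined, the automatic conversion of values may no longer apply, so the sketch assumes the default configuration.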

You can copy the contents of the string easily back to a buffer:

new String:str = @("Dynamic string");
new buffer[16];
str_get(str, buffer);
print(buffer);

Before 0.4, @ used to be an alias of str_new, but now it uses str_new_static. The difference is in how it obtains the length of the string. str_new looks for the null character in the string to mark the end, but str_new_static gets the length of the array in a parameter and uses that. Therefore it is also possible to get the null character in a string: @("aa\0bb").
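The difference can be observed by comparing lengths. This is a sketch under the assumption that the include is named PawnPlus; it only uses functions already shown on this page.

```pawn
#include <a_samp>
#include <PawnPlus> // assumed include name for this plugin

public OnFilterScriptInit()
{
    // Length is taken from the size of the literal's array.
    new String:counted = @("aa\0bb");
    // Length is found by scanning for the first null character.
    new String:scanned = str_new("aa\0bb");

    // The scanned string stops at the embedded null, so the counted
    // string is expected to be the longer of the two.
    printf("%d %d", str_len(counted), str_len(scanned));
}
```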

Native interoperability

Almost any native function can be changed so that instead of taking a string as a character array, it takes a dynamic string instead. Let's start with a simple function like print:

native print(const string[]);

The native function expects an address of a string inside the AMX machine's memory. However, this plugin enables you to pass it an address outside the machine and it will interpret it as a string if possible. The modification is simple:

native print_s(AmxString:string) = print;

The tag must be AmxString instead of String because the address itself must be relative to the abstract machine's memory. Internally, str_addr is called for the conversion, which returns the offset address. The = allows changing the name of a native in the script while still referring to the same function.

new String:str1 = @("Hello ");
new String:str2 = @("world!");
print_s(str1+str2);

Unfortunately, printf cannot be modified in such a way, because it doesn't use the standard AMX API to access its parameters. For variadic functions (with ...), the conversion to AmxString is not done automatically and must be done manually:

native CallLocalFunctionStr(const function[], const format[], {AmxString,Float,_}:...) = CallLocalFunction;

public OnFilterScriptInit()
{
    new String:str1 = @("Hello ");
    new String:str2 = @("world!");
    CallLocalFunctionStr(#StringReceiver, "s", str_addr(str1+str2));
}

forward StringReceiver(str[]);
public StringReceiver(str[])
{
    print(str);
}

If str_addr hadn't been used, the compiler would issue a warning, but wouldn't attempt to convert the value. This is also the second way to extract the contents of a dynamically allocated string, one that doesn't require knowing the size of the buffer.

Using dynamic strings is safe if the function uses the string as its input and doesn't modify the contents (usually coupled with const in the declaration), but it is also possible to use them as buffers for functions that modify the contents.

Converting these functions is not always simple or consistent, because while standard Pawn functions and plugins use amx_GetAddr, SA-MP doesn't use it for output strings, and it computes the pointer from the address directly, without any checks. "Well-behaved" functions can be converted in the standard way, but the size of the string must be taken into account:

native strcat_s_impl(AmxString:dest, const source[], maxlength) = strcat;
stock strcat_s(StringTag:dest, const source[])
{
    return strcat_s_impl(dest, source, str_len(dest) + 1);
}

The correct size is indeed str_len(dest) + 1, because the actual buffer includes the null character. Unfortunately, calling this produces an error, since strcat itself checks the validity of the address it obtains. This cannot be circumvented without relying on the memory layout of the executable, but only the standard library functions do this.

Normal SA-MP functions access the address directly, which means that the address of the actual character data must be passed. This is represented by the AmxStringBuffer: tag:

native GetPlayerNameStrImpl(playerid, AmxStringBuffer:name, len) = GetPlayerName;
stock String:GetPlayerNameStr(playerid)
{
    new String:str = str_new_buf(MAX_PLAYER_NAME);
    str_resize(str, GetPlayerNameStrImpl(playerid, str, MAX_PLAYER_NAME));
    return str;
}

GetPlayerNameStrImpl can be called directly, but for convenience, a function that creates the output string automatically should be used. str_new_buf creates a new empty string and sets its size to size - 1. The conversion to AmxStringBuffer: returns the address of the characters, which is guaranteed to be a block of memory of at least size bytes (including the null character). GetPlayerName then writes directly into the buffer, and returns the number of characters written, which is then used to truncate the string (null characters aren't used to determine the size of dynamic strings).

Since AmxStringBuffer: is the actual address of the characters, you can add bytes to it to produce a pointer into the middle of the string. Currently, this relies on the number of bytes, but it may be changed to the number of cells in the future, so it should not be relied upon:

new String:str = @("My name is _______________________");
// "My name is " is 11 characters; the offset is counted in bytes (11 cells * 4 bytes = 44)
GetPlayerNameStrImpl(playerid, str_buf_addr(str)+44, MAX_PLAYER_NAME);
print_s(str);

The null string

There is a special string value, STRING_NULL: an immutable string that can be used in all functions but is always empty. It also has a special behaviour when used as an argument for variadic functions:

public OnFilterScriptInit()
{
    CallLocalFunctionStr(#Func, "s", str_addr(STRING_NULL));
}

forward Func(str[]);
public Func(str[])
{
    printf("%d", str[0]); //1
}

Since these functions generally crash when passed an empty string, when STRING_NULL is passed to them, it is converted to "\1;" instead of an empty string.

String lifetime and garbage collection

Pawn doesn't provide the necessary tools to track the lifetime of a value, so there were two real options when deciding how to handle the lifetime of dynamic strings: either no garbage collection, with the risk of leaking memory, or aggressive garbage collection, with the risk of dangling pointers to collected strings. The latter was chosen.

A standard (local) string is collected as soon as the top-level callback ends. This means that when, for example, the code is entered via OnPlayerConnect, there can be any number of nested callbacks (via CallLocalFunction/CallRemoteFunction for example), and strings can be created and used at will there, but as soon as OnPlayerConnect returns, all local strings are collected and unusable. Therefore you can safely return a string from any function without worrying that it will be collected before the caller uses it.

(Note: If pawn_register_callback is used, strings created in any handler will not be usable in subsequent handlers, since a handler is a top-level callback.)

However, more persistent lifetime for strings was also needed, and thus global strings were introduced. These strings are not garbage collected, and are intended for use in global (or static) variables, PVars or anywhere a variable lasts longer than a server tick (however, ticks are not used to trigger garbage collection; the exit from a public function is).

A global string can be easily created by assigning a value with the String tag to a variable with the GlobalString tag:

public OnFilterScriptInit()
{
    new GlobalString:str = @("Persistent string");
    SetTimerEx(#OnTimer, 1000, false, "d", _:str);
}

forward OnTimer(GlobalString:str);
public OnTimer(GlobalString:str)
{
    print_s(str);
    str_free(str);
}

The assignment hides a call to str_to_global which moves the string from the local pool into the global pool (no copying involved). Once there, the string will last for eternity unless either freed (using str_free) or moved back to the local pool (using str_to_local) and collected. Therefore, OnTimer may be equivalently designed like this:

forward OnTimer(String:str);
public OnTimer(String:str)
{
    str_to_local(str);
    print_s(str);
}

Notice the change of tag from GlobalString to String. A variable tagged String can also hold a GlobalString, and this assignment will not move the string to the local pool. Since the call to str_to_local works with the actual string instance, leaving the GlobalString tag on could confuse following calls and assignments.

It is recommended to use the second approach if possible, to avoid the danger of forgetting to free the string in every branch of the function. It has no significant performance cost.
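The rules above imply a common pitfall: keeping a plain String in a global variable across callbacks. The following is a sketch only; the include name PawnPlus and the variable names g_temporary and g_persistent are assumptions made for illustration.

```pawn
#include <a_samp>
#include <PawnPlus> // assumed include name for this plugin

native print_s(AmxString:string) = print;

new String:g_temporary;        // hypothetical: holds a local-pool string
new GlobalString:g_persistent; // hypothetical: assignment moves the string to the global pool

public OnFilterScriptInit()
{
    g_temporary = @("temporary");
    g_persistent = @("persistent");
    // Both strings are usable until this top-level callback returns.
    print_s(g_temporary);
}

public OnPlayerConnect(playerid)
{
    // g_temporary was collected when OnFilterScriptInit returned,
    // so it must not be used here. g_persistent is still valid.
    print_s(g_persistent);
    return 1;
}

public OnFilterScriptExit()
{
    // Global strings are never collected automatically.
    str_free(g_persistent);
}
```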

Null characters

Strings in Pawn (and SA-MP) are null-terminated, meaning that the string ends at the first zero cell, so a string cannot normally contain a null character. There are two ways to get around this problem: store the length together with the string, or use another character:

new str[] = "A\256;bit longer";
printf("%d %s", strlen(str), str); //12 A

256 (0x100) does not fit into a byte, and so it is truncated into a null character when displayed, but functions like strlen check cells and not bytes. By default, str_new respects the original cells of the string and does not change them:

new String:str = str_new("A\256;bit longer");
printf("%d %d", str_len(str), str_getc(str, 1)); //12 256

In some cases, you might want to represent the string how it was intended (i.e. with a proper null character). In this case, use str_truncate as the second argument to str_new:

new String:str = str_new("A\256;bit longer", str_truncate);
printf("%d %d", str_len(str), str_getc(str, 1)); //12 0

str_truncate will truncate all cells to a single byte. However, there is now a problem in SA-MP functions that use amx_StrLen to compute the length of the string. To fix it, this plugin also hooks the function, but since this affects almost any call to a native function taking a string, you might want to disable the hook:

native strlen_s(AmxString:string) = strlen;

public OnFilterScriptInit()
{
    new String:str = str_new("A\256;bit longer", str_truncate);
    printf("%d", strlen_s(str)); //12
    pp_hook_strlen(false);
    printf("%d", strlen_s(str)); //1
}

Since 0.4, the recommended way is to use @/str_new_static on string literals and automatically sized const arrays, and str_new on arrays where the length of the string may not match the size of the array.
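A short sketch of this recommendation, assuming the include is named PawnPlus. The array buffer is larger than the text it holds, so its string is created with str_new, while the literal uses @.

```pawn
#include <a_samp>
#include <PawnPlus> // assumed include name for this plugin

public OnFilterScriptInit()
{
    new buffer[32] = "short";           // 5 characters in a 32-cell array
    new String:fromArray = str_new(buffer); // stops at the first null cell
    new String:fromLiteral = @("short");    // array size matches the text

    if (fromArray == fromLiteral)
    {
        print("same contents");
    }
    // @(buffer) would take its length from the 32-cell array instead,
    // so its result would not be expected to equal "short".
}
```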

Strings as arrays and arrays as strings

Since dynamic strings can hold any number of arbitrary cells, they can effectively also store standard arrays, although the contents are not easily accessed:

enum STRUCT
{
    S_FIELD1,
    Float:S_FIELD2,
    S_FIELD3[16]
}

public OnFilterScriptInit()
{
    new data[STRUCT];
    data[S_FIELD1] = -1729;
    data[S_FIELD2] = 1.618034;
    data[S_FIELD3] = "abcdefghijklmno";
    
    new String:str = str_new_arr(data[STRUCT:0], _:STRUCT);
    printf("%d", str_getc(str, 0)); //-1729
    
    new data2[_:STRUCT + 1];
    str_get(str, data2);
    
    print(data2[_:S_FIELD3]); //abcdefghijklmno
}