Need set of C utilities compiled for the Mac

I need a small set of command-line utilities for analyzing Go games to be compiled for the Mac. The open-source package includes documentation for each utility and a Makefile, which should speed up the process. The project files can be found at the author’s website:

   https://homepages.cwi.nl/~aeb/go/sgfutils/
   File:  sgfutils-0.25.tgz

Important notes:

1) Go games (and collections of them) are packaged into .sgf files. An example .sgf file with 10 games can be downloaded at:
https://drive.google.com/open?id=1LDrt-IFtnxyaG9Sbfo8brh0tKuAcK9-Z

2) The specification for the SGF format can be found here: http://www.red-bean.com/sgf/

3) I’m particularly interested in the sgfsplit utility to split a collection of games into separate .sgf files. I do need one modification to this utility:

Instead of assigning default names to the files like “X-1”, “X-2”, etc., I need to have the utility pick out the actual name of each game from its “GN” tag. For example, the output file name for the game below should be called “Game_1.sgf”.

   (;GM[1]FF[4]SZ[19]AP[SmartGo Kifu:3.1.2]CA[UTF-8]
   GN[Game 1]
   PW[AlphaGo]
   . . . . .

4) The author mentions that all the files are standard C, but has one caution: the sgfinfo.c file has an #include for an MD5 signature algorithm. So, this package may need to be installed.

Good luck to the winner!

Wow, great work! Kudos to both weslly and CyteBode! I really appreciate your efforts. Bounties for both of you. @CyteBode: I was able to follow your detailed instructions and compile the code myself without any problem. Your coding changes to sgfsplit.c look very nice and worked perfectly. @weslly: I tested your compiled utilities and they work fine. I wasn’t able to get the Python script running as it ran into some kind of error. Do you know what the problem might be?
CuriousMynd 13 days ago
Python error message:

iMac-27-HD-3:ten Eric$ python ./sgfsplit.py ./Tengames.sgf
Traceback (most recent call last):
  File "./sgfsplit.py", line 24, in <module>
    call([sgfsplitbin, '-x', randomprefix, fname])
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
CuriousMynd 13 days ago
@CuriousMynd thanks for the tip! You can call the script directly without python before it: $ ./sgfsplit.py ./Tengames.sgf
weslly 13 days ago
awarded to CyteBode
Tags
c
mac-os-x


2 Solutions


https://www.dropbox.com/s/ohu9telioiv0yfz/sgfutils.zip?dl=0

I didn't modify the sgfsplit binary itself, but I made a small Python script that parses and renames the files generated by the binary. You can call it the same way as sgfsplit, just add the .py extension to it:

./sgfsplit.py ./ten_games.sgf

It will output one file per game, each named after the game name stored in that file.


I don't have a Mac so I can't help with the compilation part.

Edit: Never mind, I dusted off my old Hackintosh and I managed to compile the project.

Here are the files: https://www.dropbox.com/s/22otuph2dlqarb3/sgfutils_bountify.zip?dl=0, compiled on OS X 10.10 Yosemite.

However, I would strongly advise against running compiled binaries from an untrusted source. Instead, you should really compile the source yourself.

You will need Homebrew (brew) and possibly Xcode. If you don't have Xcode, or don't want to install it, apparently you can just run xcode-select --install to get the command-line tools before installing brew.

Then run the following commands in a terminal:

brew update
brew upgrade
brew install openssl

cd /usr/local/include
ln -s ../opt/openssl/include/openssl .

If you haven't downloaded the source yet:

cd ~
curl -O https://homepages.cwi.nl/~aeb/go/sgfutils/sgfutils-0.25.tgz
tar -zxvf sgfutils-0.25.tgz
cd sgfutils-0.25

Then modify the Makefile as follows:

Line 20: Add the include path at the end so it becomes CFLAGS=-Wall -Wmissing-prototypes -O3 -I/usr/local/include

Line 23: Uncomment the line so it's just LDLIBS=-liconv without the leading #

Finally, overwrite sgfsplit.c with my version below and run make. It should compile with a few warnings but no errors.

Here is my modified version of sgfsplit.c, as per #3:

/*
 * sgfsplit: split a game collection into individual files
 *  (and do nothing else). This works for arbitrary games
 *  written in SGF format: the fields are not interpreted.
 * Names are constructed from a counter; unless -n is given, each
 *  file is then renamed after the game's GN value when one is found.
 *
 * By default, the names are "X-%04d.sgf", counting from 1
 * -d#:     output counter zero padded to # digits
 * -s#:     start counting from #
 * -z:      start counting from 0
 * -x PREFIX:   set prefix to use instead of "X-"
 * -F FORMAT:   format used instead of "X-%d.sgf"
 * -p:      preserve trailing junk
 * -n:      avoid renaming the output file to the game name
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdarg.h>
#include "errexit.h"

#define DEFAULT_DIGNUM  4
#define DEFAULT_PREFIX  "X-"

#define MAXNAMLTH   255
char ofilename[MAXNAMLTH+1];
char gamename[MAXNAMLTH+1];
size_t gn_ix = 0;
int xfnct;
FILE *outf;
char *format;
int optc = 0;
int optp = 0;
int optn = 0;

static void
construct_next_filename() {
    int n;

    n = snprintf(ofilename, sizeof(ofilename), format, xfnct++);
    if (n >= MAXNAMLTH)
        errexit("output filename too long");
}

static void gn_to_fn() {
    int i=0;
    for (; i<MAXNAMLTH-3; i++) {
        char c = gamename[i];
        if (c == '\0')
            break;
        switch (c) {
            case ':':
            case '\\':
            case '/':
            case ' ':
                gamename[i] = '_';
        }
    }
    snprintf(gamename+i, 5, ".sgf");
}

static void
create_outfile() {
    construct_next_filename();
    outf = fopen(ofilename, "r");
    while (outf != NULL) {
        /* name exists already */
        warn("not overwriting existing %s", ofilename);
        fclose(outf);
        construct_next_filename();
        outf = fopen(ofilename, "r");
    }
    outf = fopen(ofilename, "w");
    if (outf == NULL)
        errexit("cannot open file %s", ofilename);
}

/* states: INIT0, INIT1, STARTED, INSIDEBRK, ESC, DONE */
int parenct, ingame, done;
static void (*state)(int);

static void state_init0(int c);
static void state_init1(int c);
static void state_started(int c);
static void state_insidebrk(int c);
static void state_insidename(int c);
static void state_escaped(int c);

static void state_init0(int c) {
    if (c == '(')
        state = state_init1;
}

static void state_init1(int c) {
    if (c == ';') {
        state = state_started;
        parenct = 1;
        ingame = 1;
        if (optc) {
            putc('(', outf);
            putc(';', outf);
        }
    } else
        state = state_init0;
}

static void state_started(int c) {
    static char last_two[3] = {'\0'};
    if (c == '(') {
        parenct++;
    }
    else if (c == '[') {
        if (strcmp(last_two, "GN") == 0) {
            state = state_insidename;
        } else {
            state = state_insidebrk;
        }
    }
    else if (c == ')') {
        parenct--;
        if (parenct == 0) {
            done = 1;
            ingame = 0;
            state = state_init0;
        }
    } else {
        last_two[0] = last_two[1];
        last_two[1] = (char)c;
    }
}


static void (*escaped_state)(int);
static void state_insidebrk(int c) {
    if (c == ']')
        state = state_started;
    if (c == '\\') {
        escaped_state = state_insidebrk;
        state = state_escaped;
    }
}

static void state_insidename(int c) {
    if (c == ']') {
        gamename[gn_ix] = '\0';
        state = state_started;
    } else {
        if (c == '\\') {
            escaped_state = state_insidename;
            state = state_escaped;
        } else if (gn_ix < MAXNAMLTH-4) {
            if (c != '\0') {
                gamename[gn_ix++] = (char)c;
            }
        }
    }
}

static void state_escaped(int c) {
    state = escaped_state;
}

#define BUFSZ   16384

static void
readsgf(char *fn) {
    char *infilename = (fn ? fn : "-");
    int c;
    FILE *f = NULL;

    if (strcmp(infilename, "-")) {
        f = freopen(fn, "r", stdin);
        if (!f)
            errexit("cannot open %s", fn);
    }

    while (1) {
        while ((c = getchar()) == '\n' || c == '\r' || c == ' ') ;
        if (c == EOF)
            goto fin;
        ungetc(c, stdin);

        create_outfile();

        parenct = done = ingame = 0;
        state = state_init0;
        gn_ix = 0;

        while (!done) {
            c = getchar();
            if (c == EOF)
                goto eof;
            if (ingame || !optc)
                putc(c, outf);
            (*state)(c);
        }

        putc('\n', outf);
        fclose(outf);

        /* rename only after the output file is complete and closed */
        if (!optn) {
            if (gn_ix > 0) {
                FILE *fg;

                gn_to_fn();
                fg = fopen(gamename, "r");
                if (fg) {
                    warn("already exists: %s", gamename);
                    fclose(fg);
                } else if (rename(ofilename, gamename) != 0) {
                    warn("cannot rename %s to %s", ofilename, gamename);
                }
            } else {
                warn("GN not found in sgf %d", xfnct-1);
            }
        }
    }

eof:
    /* no game seen, just trailing garbage - throw it out? */
    fclose(outf);
    if (optp)
        warn("warning: only trailing junk in %s", ofilename);
    else {
        unlink(ofilename);
        if (!optc)
            warn("trailing junk discarded");
    }
    return;

fin:
    if (f)
        fclose(f);
}

/* try to avoid crashes from silly format strings */
static void
check_format() {
    int ct = 0, m;
    char *s = format, *se;

    /* need a single %d or %u */
    while (*s) {
        if (*s++ != '%')
            continue;
        if (*s == '%') {
            s++;
            continue;
        }
        m = strtoul(s, &se, 10);
        if (*se == '$')
            errexit("unsupported %N$-construction in format");

        /* flag characters - note: .I are nonstandard*/
        while (*s && index("#0- +.I", *s))
            s++;

        /* field width */
        if (*s == '*')
            errexit("unsupported *-width in format");
        m = strtoul(s, &se, 10);
        s = se;

        /* precision */
        if (*s == '.') {
            s++;
            if (*s == '*')
                errexit("unsupported *-precision in format");
            m = strtoul(s, &se, 10);
            s = se;
        }

        /* length modifier */
        while (*s && index("hlLqjzt", *s))
            s++;

        /* format character */
        if (!*s)
            errexit("missing format character after %");
        /* c is not meaningful here */
        if (!index("diouxX", *s))
            errexit("format must use integer conversion only");
        s++;
        ct++;
        m = m;  /* for gcc */
    }
    if (ct > 1)
        errexit("format must use a single integer argument");
    if (ct == 0)
        errexit("format does not use any parameter (like %%d)");
}

static int
my_atoi(char *s) {
    unsigned long n;
    char *se;

    n = strtoul(s, &se, 10);
    if (*se)
        errexit("trailing junk '%s' in optarg", se);
    return n;
}

static void
usage() {
    fprintf(stderr, "Usage: sgfsplit [-d#] [-s#] [-z] "
        "[-x prefix] [-F format] [-c] [-p] [-n] [files]\n");
    exit(1);
}

int
main(int argc, char **argv){
    char *prefix = NULL;
    int i, opt, dignum;

    progname = "sgfsplit";
    format = NULL;
    xfnct = 1;      /* start counting from 1 */
    dignum = -1;        /* unset */

#if 1
    /* just nonsense - allows use of the I flag character */
#include <locale.h>
    if (!setlocale(LC_ALL, ""))
        warn("failed setting locale");
#endif

    while ((opt = getopt(argc, argv, "d:s:zx:F:cpn")) != -1) {
        switch (opt) {
        case 'd':
            dignum = my_atoi(optarg);
            break;
        case 's':
            xfnct = my_atoi(optarg);
            break;
        case 'z':
            xfnct = 0;
            break;
        case 'x':
            prefix = optarg;
            break;
        case 'F':
            format = optarg;
            break;
        case 'c':
            optc = 1;
            break;
        case 'p':
            optp = 1;
            break;
        case 'n':
            optn = 1;
            break;      /* without this, -n fell through to usage() */
        default:
            usage();
        }
    }

    if (format && prefix) {
        warn("warning: format overrides prefix");
        prefix = NULL;
    }
    if (format && dignum >= 0) {
        warn("warning: format overrides digwidth");
        dignum = 0;
    }
    if (!format) {
        char formatbuf[100], *p;

        if (!prefix)
            prefix = DEFAULT_PREFIX;
        if (dignum < 0)
            dignum = DEFAULT_DIGNUM;

        p = formatbuf;
        p += sprintf(p, "%s%%", prefix);
        if (dignum)
            p += sprintf(p, "0%d", dignum);
        p += sprintf(p, "d.sgf");
        format = strdup(formatbuf);
        if (!format)
            errexit("out of memory");
    }

    check_format();

    if (optind == argc) {
        readsgf(NULL);      /* read stdin */
    } else for (i=optind; i<argc; i++) {
        readsgf(argv[i]);
    }

    return 0;
}

Basically, I modified the state machine so it could parse the game name if there is a GN property and it isn't empty. Then, the name gets turned into a filesystem-friendly filename (colons, slashes, backslashes and spaces become underscores) with the .sgf extension, and the program tries to rename the counter-based output file to that. If the new filename already exists, it prints a warning and simply keeps the original filename.
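To make the name-to-filename step easier to follow in isolation, here is a rough sketch of the same idea as a single function. The name gn_filename is made up for this illustration and is not part of sgfutils. It also takes two shortcuts that the real state machine does not: it scans the raw text for the first "GN[" (so a property name that merely ends in GN, or the text "GN[" inside a comment, would match too), and it keeps an escaped character literally instead of dropping it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of sgfutils): copy the value of the first
 * GN[...] found in `sgf` into `out`, replace characters that are awkward
 * in filenames with '_', and append ".sgf".
 * Returns 1 on success, 0 if no non-empty GN value was found or the
 * buffer is too small to hold even a one-character name.
 */
static int gn_filename(const char *sgf, char *out, size_t outsz) {
    const char *p;
    size_t n = 0;

    assert(sgf != NULL && out != NULL);
    if (outsz < sizeof("x.sgf"))    /* 1 char + ".sgf" + NUL */
        return 0;

    p = strstr(sgf, "GN[");
    if (p == NULL)
        return 0;
    p += 3;                         /* skip past "GN[" */

    /* leave 5 bytes free for ".sgf" and the terminating NUL */
    while (*p != '\0' && *p != ']' && n < outsz - 5) {
        char c = *p++;
        if (c == '\\') {            /* SGF escape: take next char as-is */
            if (*p == '\0')
                break;
            c = *p++;
        }
        if (c == ':' || c == '\\' || c == '/' || c == ' ')
            c = '_';
        out[n++] = c;
    }
    if (n == 0)
        return 0;
    memcpy(out + n, ".sgf", 5);     /* 4 characters plus the NUL */
    return 1;
}
```

With a 64-byte buffer, passing the example game from the question (the one containing GN[Game 1]) should leave "Game_1.sgf" in the buffer, and a game without any GN value makes the function return 0, which corresponds to the "GN not found" warning in sgfsplit.c.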

I also added a -n flag that makes the program act the same as before by foregoing the renaming step. Furthermore, I changed the MAXNAMLTH constant from 4096 to 255. As far as I know, most filesystems on Unix-like OSes only allow a single filename of up to 255 bytes. It's only full paths that can be longer (up to 4096 bytes on Linux, 1024 on OS X).
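If you would rather not rely on a hard-coded 255, POSIX lets a program ask the filesystem for its filename limit at run time. The sketch below is only an illustration of that idea and is not what sgfsplit.c does; max_filename_bytes is a made-up name, and falling back to 255 when the query fails or reports no fixed limit is my own assumption.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of sgfutils): return the longest filename,
 * in bytes, accepted by the filesystem that holds `dir`. pathconf() returns
 * -1 both on error and when there is no fixed limit; in either case this
 * sketch falls back to 255 (an assumption, matching MAXNAMLTH above).
 */
static long max_filename_bytes(const char *dir) {
    long n;

    assert(dir != NULL);
    n = pathconf(dir, _PC_NAME_MAX);
    return (n > 0) ? n : 255;
}
```

Calling max_filename_bytes(".") before building the output name would give the limit for the current directory, which is where sgfsplit writes its files.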

I tried to keep the coding style the same, and it compiles fine on Linux without any warning, so it should hopefully compile on OS X as well. Edit: And it does!
