Prolog: An AI program that plays Eternagame

I am writing an AI program in the Prolog language that plays Eternagame. The aim is not to explicitly solve the puzzles but to provide a reasonable number of candidate solutions. I do not want to take the fun out of the game, but instead to use the emergent wisdom of the gaming community as I go along. I am sharing my results and source code on a google drive. Here is the link for it:

https://drive.google.com/drive/folders/1ROnT1-C_X8OE1CGoHJEMUFE45WB4Nd5T?usp=sharing

I may not have anything new for a couple of days. The last time I ran the program on this dinky laptop, it took over 40 minutes to complete. I'm going to try to move it to my Z600 workstation. It's a workhorse with two Xeon chips, but it's old and can't be upgraded to Windows 10, so it is no longer safe to connect it to the internet. So I'll have to work there, then put the results on a flash drive and load them onto this laptop, from which I can update the google drive.

I'm hoping to hear from some of you.


i can give you a good computer

That sounds great. How can we work this out?

i was just joking
um sorry

That's okay. No harm done.

This looks awesome!

Could you link to some descriptions of how to interpret Prolog and some of the basic strategies you might be implementing in the program?

Specifically, does providing candidate solutions mean all of the candidates are viable solutions? Or more that it provides jumping off points that you can use to get to the end of puzzles?


Here is a link to the Prolog manual:
Prolog Manual
Once you get there you can also go to the download page and get the free(!) compiler so you can try it yourself. I find the manual a bit opaque, but you youngsters who were raised on Unix may find it quite readable.

There are a couple of good books on the subject: "Programming in Prolog" by Clocksin & Mellish and "The Art of Prolog" by Sterling and Shapiro.

Here's a very quick primer for those who haven't gotten the emails I sent you:
:- means 'if'.
. ends a clause (; means 'or').
, means 'and', except when it's just an argument delimiter within () or [].
Variables start with an uppercase letter.
Constants start with a lowercase letter or a number.
Square brackets enclose a list.
[] is the empty list.
[H|T] is a list whose first element is the head H and whose remainder is the tail T (itself a list).
Subroutines are called predicates.
A predicate may have multiple clauses, each ending with a period. Prolog tries them in order, so together they act as alternatives ('or').
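To make the primer concrete, here is a minimal sketch that should load in SWI-Prolog. The predicate names (bond, strong_bond, seq_length) are made up for illustration and are not taken from Rnafold.pl.

```prolog
:- use_module(library(lists)).   % for member/2

% Facts: the three bond pairs used in the game, in both orientations.
bond(a, u).
bond(u, a).
bond(c, g).
bond(g, c).
bond(g, u).
bond(u, g).

% A rule. ':-' reads as 'if' and ',' reads as 'and':
% X-Y is a strong bond if it is a bond and both bases are c or g.
strong_bond(X, Y) :-
    bond(X, Y),
    member(X, [c, g]),
    member(Y, [c, g]).

% One predicate with two clauses, each ending with '.'.
% seq_length(List, N): N is the number of elements in List.
seq_length([], 0).
seq_length([_H|T], N) :-
    seq_length(T, N0),
    N is N0 + 1.
```

At the prompt, a query such as ?- strong_bond(c, Y). asks Prolog to find a Y that makes the rule true, and ?- seq_length([a,c,g,u], N). binds N to the length of the list.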

Your second question is one that also occupies my mind. The easiest thing to do is to generate all possible combinations of bases that are not locked in the puzzle description, thus producing an astronomical number of possible sequences. To pare that down I apply rules. The first is to specify that all pairs which are bonded in the final structure consist of bond pairs (u-a, c-g and u-g) in the sequence. In some cases it is necessary to add an empty cell (e) to make things line up. The e is then deleted before output.
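The first rule can be sketched roughly as follows. This is not the code in Rnafold.pl. It assumes, purely for illustration, that the target structure is given as a length N plus a list of I-J position pairs (1-based), and the predicate names are hypothetical. The empty cell (e) is left out.

```prolog
:- use_module(library(lists)).   % for nth1/3

base(a). base(c). base(g). base(u).

bond(a, u). bond(u, a).
bond(c, g). bond(g, c).
bond(g, u). bond(u, g).

% candidate(+N, +Pairs, -Seq): Seq is a list of N bases in which every
% I-J in Pairs holds one of the bond pairs. On backtracking it
% enumerates all such sequences.
candidate(N, Pairs, Seq) :-
    length(Seq, N),
    constrain_pairs(Pairs, Seq),
    fill_unpaired(Seq).

% Bind the two positions of each pair to a bond pair first, so that
% non-bonding combinations are never generated at all.
constrain_pairs([], _).
constrain_pairs([I-J|Rest], Seq) :-
    nth1(I, Seq, X),
    nth1(J, Seq, Y),
    bond(X, Y),
    constrain_pairs(Rest, Seq).

% Any position still unbound is unpaired and may be any base.
fill_unpaired([]).
fill_unpaired([B|Bs]) :-
    ( var(B) -> base(B) ; true ),
    fill_unpaired(Bs).
```

For a 4-base sequence with positions 1 and 4 paired, ?- candidate(4, [1-4], S). yields 96 sequences on backtracking (6 bond pairs times 4 times 4) instead of all 256. Constraining the paired positions before filling in the rest is what keeps the search from visiting the rejected sequences.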

Then there are more heuristic limits. These include requiring strong (c,g) bonds when closing loops and requiring that (u,a) bonds alternate along the length of the RNA when possible. I will be looking for a lot more of these in the future.

So, the idea is to provide output that is more than randomly likely to succeed, is a much smaller set than all that could be imagined, but still leaves room for those "outside the box" solutions, which, I believe, are the main objective of the Open Vaccine project. I have added a Focus variable, which may be set to w (wide), m (medium), or n (narrow) to allow for the application of more or fewer rules during the run.
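One way the heuristic rules and the Focus setting could fit together is sketched below. Everything here is an assumption made for illustration: which rules belong to which focus, the rule names, the representation of a stem as a list of X-Y pairs starting from the pair that closes the loop, and the reading of "alternate" as "no two adjacent a-u pairs in the same orientation". Rnafold.pl may well do it differently.

```prolog
:- use_module(library(lists)).   % for member/2 and append/3

% rules_for(+Focus, -Rules): hypothetical mapping from focus to rules.
rules_for(w, []).
rules_for(m, [strong_closing]).
rules_for(n, [strong_closing, alternate_au]).

% passes(+Focus, +Stem): Stem satisfies every rule active at this focus.
passes(Focus, Stem) :-
    rules_for(Focus, Rules),
    forall(member(R, Rules), rule_ok(R, Stem)).

% The pair closing the loop (first in the list) must be c-g or g-c.
rule_ok(strong_closing, [X-Y|_]) :- strong(X, Y).
% No two neighbouring a-u pairs may have the same orientation.
rule_ok(alternate_au, Stem) :- \+ repeated_au(Stem).

strong(c, g).
strong(g, c).

% repeated_au(+Stem): two adjacent pairs are identical a-u or u-a pairs.
repeated_au(Stem) :-
    append(_, [X-Y, X-Y|_], Stem),
    au(X, Y).

au(a, u).
au(u, a).
```

With this layout, adding a new heuristic means adding one rule_ok clause and listing its name under the focus levels where it should apply. A wide focus accepts every stem, while a narrow one rejects, for example, [g-c, a-u, a-u].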

The sets for the first two puzzles are different. Of course, finding a solution to either of these is simple. However, if we take the problem at its word, then most solutions will have more than five of the required base. You can't actually get to those from within the game. I think there are about 12K solutions for each one, and I think that the output sets enumerate all of them. I can't say that I have either counted or checked them all.

A few quick words on the google drive.
The source file is Rnafold.pl
Text (.txt) files are the raw output from the program.
Files 1 through 11 are for the first eleven puzzles.
gs1 is for the next puzzle (#1 in gene synthesizer).
A "_w", "_m" or "_n" in the file name indicates it was done with a specific focus.
Files with the extension .rtf have been edited with comments and better readability when printed.

This is getting to be a pretty long reply, so I think I will stop now.

One of my big problems right now is that the only way I have to find out which of the 'candidates' is an actual solution is to enter each sequence through the GUI of eternagame. This is pretty much impractical for more than a few sequences. I could get much more useful results if there were some way to run my output file, as a whole, through eternagame and have it tell me which ones worked.

The folding engines in Eterna are buried deep in the code (for efficiency, but also possibly for copyright reasons) and I don't think that there is any way to use them to test a design automatically. But the code for most of the folding engines is available online (mostly in C++, but there's also Vienna 1 code in JavaScript), so it should be possible to compile these to check your designs.


The engines themselves are readily available from their original creators, and our custom patches and compilation instructions for usage within our game are available here: https://github.com/eternagame/EternaJS/tree/master/lib

We compile with Emscripten to WebAssembly so that it's usable within a browser; that of course isn't a necessity (they're originally built to compile as CLI apps). One thing to note, though: if you use the models directly, you won't get information on whether a design passes constraints. You just get what the model spits out (MFE, structure energies, etc.).

If you want to work within the bounds of Eterna itself, you can write an in-game booster to run through the possibilities and check whether each one satisfies the puzzle.

Feel free to send questions my way on any of this - I'm happy to do what I can to clarify. :slight_smile:


Thanks so much. This may take me a while. I just got my brain reconfigured for Prolog and it's been a while since I worked in C. I guess my first order is to get a C++ compiler. I suspect I will be back with some questions.

I've read this a little more thoroughly now and see that I don't need my own C compiler and IDE. That's a relief! I have a few things on my plate now, like implementing multi-threading in Prolog, making the next steps in rnafold.pl and trying to learn a few things about RNA. I will be getting back to this and I am sure it will be a great help in the near future.

I have placed a document on the google drive (link in original post). "Rnafold_SDD.docx" is a Software Design Description which presents one way in which Rnafold could be integrated with the eternagame software. It is not a very detailed description at this point, but is intended as a starting point.

There are four new files on the google drive. One is just the latest version of rnafold.pl. The other three are output files for gs1 (gene synthesizer level 1) with wide, medium and narrow focus. Note the difference in file size. The run times were similar (36 to 41 min.). I think this is because, although the narrower focus produces fewer results, it does extra crunching to weed out the less likely ones.

Wanted to put a plug here for an open source Python package that I've developed and use for analyzing RNA structure: http://www.github.com/daslab/arnie . It includes instructions for downloading packages and then calling them all through a Python interface.


This looks extremely interesting and will take me a while to digest. I am wondering if you have already done what I am working on? I'm taking an expert-system approach, rather than a neural-net one, to finding likely solutions to eternagame puzzles. I have always thought that the best thing was to combine the two, but that is not easily done.

I can see that you have done a lot along the lines of "give me a sequence and I will predict the structure." I am not yet clear on whether you have done "give me a structure and I will predict one or more sequences that will produce it."

Oops!
I missed the "no c-g bond" restriction in gs1. I have replaced the wide and narrow focus results and eliminated the medium (ran out of rules). The file sizes have decreased significantly. I also replaced the current version of rnafold.pl.

welcome wayment to eterna forums!

[https://riptutorial.com/ebook/prolog]
Here's another link to what I think is a good Prolog tutorial.

github says to install Python 2.7.12. But Python wants to download version 3.8.5. Will that work?