Here is a link to the Prolog manual:
Prolog Manual
Once you get there you can also go to the download page and get the free(!) compiler so you can try it yourself. I find the manual a bit opaque, but you youngsters who were raised on Unix may find it quite readable.
There are a couple of good books on the subject: "Programming in Prolog" by Clocksin & Mellish and "The Art of Prolog" by Sterling and Shapiro.
Here's a very quick primer for those who haven't gotten the emails I sent:
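:- means "if".
. ends a clause; successive clauses of the same predicate are alternatives, so it effectively means "or" (within a clause body, ; means "or").
, means "and", except when it's just an argument delimiter within () or [].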
Variables start with an uppercase letter.
Constants start with a lowercase letter or a number.
Square brackets enclose a list.
[] is the empty list.
[H|T] is a list whose first element is the head H and whose remaining elements form the tail T (itself a list).
Subroutines are called predicates.
A predicate may have multiple clauses, each ending with a period (the clauses are tried in turn, so together they mean "or").
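To make the primer concrete, here is a tiny sketch that uses every item above. The predicate names (strong_pair, len) are made up for illustration and are not taken from Rnafold.pl.

```prolog
% Facts: c and g are constants (lowercase); each fact ends with a period.
strong_pair(c, g).
strong_pair(g, c).

% len(List, N): N is the number of elements in List.
% Two clauses for one predicate; Prolog tries them in turn ("or").
len([], 0).                 % the empty list has length 0
len([_H|T], N) :-           % ":-" reads "if"
    len(T, N0),             % "," reads "and"
    N is N0 + 1.            % H, T, N, N0 are variables (uppercase)
```

Asking `?- len([a,c,g,u], N).` should answer N = 4, and `?- strong_pair(c, X).` should answer X = g.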
Your second question is one that also occupies my mind. The easiest thing to do is to generate all possible combinations of the bases that are not locked in the puzzle description, which produces an astronomical number of possible sequences. To pare that down I apply rules. The first is that every pair of positions bonded in the final structure must hold a valid bond pair (u-a, c-g or u-g) in the sequence. In some cases it is necessary to add an empty cell (e) to make things line up. The e is then deleted before output.
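The generate-and-filter idea can be sketched roughly as follows. This is not the code in Rnafold.pl, just a minimal illustration: the predicate names, the Position-Base and I-J conventions, and the omission of the empty cell (e) are all my assumptions, and it assumes a Prolog with library(lists), such as SWI-Prolog.

```prolog
:- use_module(library(lists)).   % for nth1/3

base(a). base(c). base(g). base(u).

% Allowed bond pairs in both orientations: u-a, c-g and u-g.
bond(a, u). bond(u, a).
bond(c, g). bond(g, c).
bond(g, u). bond(u, g).

% candidate(+Len, +Locked, +Bonds, -Seq)
% Locked: list of Position-Base fixed by the puzzle (1-based positions).
% Bonds:  list of I-J positions that are paired in the target structure.
candidate(Len, Locked, Bonds, Seq) :-
    length(Seq, Len),        % a list of Len unbound variables
    lock_all(Locked, Seq),   % pin the locked bases first
    bond_all(Bonds, Seq),    % bonded positions must hold a bond pair
    fill_all(Seq).           % enumerate whatever is still free

lock_all([], _).
lock_all([I-B|Rest], Seq) :- nth1(I, Seq, B), lock_all(Rest, Seq).

bond_all([], _).
bond_all([I-J|Rest], Seq) :-
    nth1(I, Seq, X), nth1(J, Seq, Y),
    bond(X, Y),
    bond_all(Rest, Seq).

fill_all([]).
fill_all([B|Bs]) :- base(B), fill_all(Bs).
```

For example, `?- candidate(4, [1-g], [1-4], Seq).` should give Seq = [g,a,a,c] first, and backtracking should yield 32 sequences in all, compared with 256 for four unconstrained positions. Applying the bond rule before filling in the free positions prunes early instead of generating everything and testing afterwards.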
Then there are more heuristic limits. These include requiring strong (c,g) bonds when closing loops and requiring that (u,a) bonds alternate along the length of the RNA when possible. I will be looking for a lot more of these in the future.
So, the idea is to provide output that is more than randomly likely to succeed and is a much smaller set than all that could be imagined, but that still leaves room for those "outside the box" solutions, which, I believe, are the main objective of the Open Vaccine project. I have added a Focus variable, which may be set to w (wide), m (medium), or n (narrow) to allow for the application of more or fewer rules during the run.
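One way a Focus setting could switch rules on and off is sketched below. The rule table is hypothetical: which heuristic applies at which focus level is my assumption for illustration, not necessarily how Rnafold.pl does it.

```prolog
% Hypothetical rule table: rule_applies(Focus, Rule).
% Assumed here: w (wide) applies no heuristic, while m and n both
% require a strong pair where a loop closes.
rule_applies(m, strong_closing).
rule_applies(n, strong_closing).

strong(c, g). strong(g, c).

bond(a, u). bond(u, a).
bond(c, g). bond(g, c).
bond(g, u). bond(u, g).

% closing_pair_ok(+Focus, ?X, ?Y): X-Y may close a loop at this focus.
closing_pair_ok(Focus, X, Y) :-
    (   rule_applies(Focus, strong_closing)
    ->  strong(X, Y)        % heuristic on: only c-g / g-c
    ;   bond(X, Y)          % heuristic off: any valid bond pair
    ).
```

With this table, `?- closing_pair_ok(w, u, a).` should succeed while `?- closing_pair_ok(n, u, a).` should fail, so a narrower focus produces a smaller output set.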
The sets for the first two puzzles are different. Of course, finding a solution to either of these is simple. However, if we take the problem at its word, then most solutions will have more than five of the required base. You can't actually get to those from within the game. I think there are about 12K solutions for each one, and I think that the output sets enumerate all of them. I can't say that I have either counted or checked them all.
A few quick words on the Google Drive.
The source file is Rnafold.pl.
Text (.txt) files are the raw output from the program.
Files 1 through 11 are for the first eleven puzzles.
gs1 is for the next puzzle (#1 in gene synthesizer).
A "_w", "_m" or "_n" in the file name indicates the run was done with a specific focus.
Files with the extension .rtf have been edited to add comments and to read better when printed.
This is getting to be a pretty long reply, so I think I will stop now.