How can we handle big amounts of data?

Player Projects are looking really exciting but the amount of data coming out of them is going to be difficult to assimilate.

So here are some suggestions (that came out of a chat) for ways to let players (and devs) look at the data coming out of them.

  1. Search on shape notation with wildcards - for example, to find all structures with shape notation ((…(…)*)) or *((…))* or similar.

  2. Eli also mentioned searching for base sequences (that search more or less exists already).

  3. Edward_Lane: Search on ‘loop energy’ and overall energy, dotplot density, and meltplot curve shape/steepness?
    Eli Fisker: Yep and I want to combine things [2:38 PM]
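To make suggestion 1 a bit more concrete, here is a minimal sketch of how a wildcard search on shape notation could work. It assumes shapes are stored as plain dot-bracket strings and that `*` means "any run of characters, including none". The function names `wildcard_to_regex` and `search_shapes` and the sample shapes are made up for illustration; nothing here reflects how the site's search is actually built.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Compile a shape pattern where '*' matches any run of characters.

    Everything else (brackets, dots) is escaped so it matches literally.
    """
    literal_parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(literal_parts))

def search_shapes(pattern: str, shapes: list[str]) -> list[str]:
    """Return the shapes that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [shape for shape in shapes if regex.fullmatch(shape)]

# Hypothetical sample data, not real lab shapes.
shapes = [
    "((...))",
    "..((...))..",
    "((..(...)..))",
    "......",
]

print(search_shapes("*((...))*", shapes))
print(search_shapes("((*(...)*))", shapes))
```

Matching against the whole string means a pattern without a leading or trailing `*` has to line up with the start and end of the shape, which seems closest to how the patterns above are written.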

Edward_Lane: So ‘advanced search’ - find me all sequences with at least one loop that has energy exactly equal to “-3”, a total energy of “-30 or more”, that contain the shape ((…(…))), contain at least 60% adenine, and where the dot plot error is low (density of grey not in target area 90%)

Eli Fisker: Yep, I love the idea of advanced search [2:40 PM]
Eli Fisker: But I’m mostly interested in energy at specific spots [2:39 PM]
Eli Fisker: But overall energy would be great to search for too [2:39 PM]
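The 'advanced search' idea above is basically a set of small tests that all have to pass. A rough sketch, under several assumptions: the `Design` record and its field names are hypothetical, "-30 or more" is read here as a total energy of -30 or lower (flip the comparison if the opposite was meant), and the dot plot criterion is left out because it is not clear how it would be stored.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Design:
    # Hypothetical record; the real data model is not known.
    name: str
    sequence: str
    shape: str
    loop_energies: list[float]
    total_energy: float

Criterion = Callable[[Design], bool]

def has_loop_energy(value: float) -> Criterion:
    """At least one loop has exactly this energy (within float tolerance)."""
    return lambda d: any(abs(e - value) < 1e-9 for e in d.loop_energies)

def total_energy_at_most(limit: float) -> Criterion:
    """Total energy is `limit` or lower (assumed reading of '-30 or more')."""
    return lambda d: d.total_energy <= limit

def base_fraction_at_least(base: str, fraction: float) -> Criterion:
    """The given base makes up at least `fraction` of the sequence."""
    return lambda d: bool(d.sequence) and d.sequence.count(base) / len(d.sequence) >= fraction

def shape_contains(fragment: str) -> Criterion:
    """The shape notation contains this literal fragment."""
    return lambda d: fragment in d.shape

def advanced_search(designs: list[Design], criteria: list[Criterion]) -> list[Design]:
    """Keep only the designs that satisfy every criterion."""
    return [d for d in designs if all(test(d) for test in criteria)]

# Made-up designs purely to exercise the filters.
designs = [
    Design("d1", "GAAAAAAAAC", "(........)", [-3.0], -31.0),
    Design("d2", "GGGAAAACCC", "(((....)))", [-3.0, -2.0], -35.0),
    Design("d3", "GAAAAAAAAC", "(........)", [-2.5], -40.0),
]
criteria = [
    has_loop_energy(-3.0),
    total_energy_at_most(-30.0),
    base_fraction_at_least("A", 0.6),
    shape_contains("(...."),
]
print([d.name for d in advanced_search(designs, criteria)])
```

Keeping each criterion as its own small function is what makes the "I want to combine things" part cheap: any subset can be put in the list.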

  1. Eli Fisker: I hadn’t thought of using meltplot, but I could imagine that could be helpful too, to be able to search for a meltplot curve on a certain degree in a certain square [2:40 PM]
    Edward_Lane: or ‘flat at start’ for x boxes [2:41 PM]
    Eli Fisker: Exactly [2:41 PM]

  2. Edward_Lane: oh and obviously the option to have all the search results appear in a sortable table with A->Z or Z->A “ordered by” options for each of the criteria [2:42 PM]

  3. Another option that might let players flag up interesting results would be to let people comment and “vote as interesting AFTER synthesis results”. That would give another searchable value (find results where X people think this result was interesting), though it might just mean some results get overlooked?

  4. When you then get a particular set of designs - you also want the option to show all other solutions to the same lab(s).

  5. You might want to consider also including a search for ‘similar labs’ where the ‘difference in shape notation’ is less than X puzzle builder button clicks (I think that describes adding/subtracting individual/pairs of bases pretty much anywhere).
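For the 'similar labs' idea in the last item, one possible stand-in for "number of puzzle builder button clicks" is the plain edit distance between two shape notation strings. This is only an approximation and an assumption on my part: adding one base pair costs two character edits here but is presumably a single click in the builder. The names `edit_distance` and `similar_labs` and the lab shapes are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or swaps."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]

def similar_labs(target_shape: str, labs: dict[str, str], limit: int) -> list[str]:
    """Names of labs whose shape is fewer than `limit` edits from the target."""
    return [name for name, shape in labs.items()
            if edit_distance(target_shape, shape) < limit]

# Hypothetical lab shapes.
labs = {
    "lab_a": "((....))",   # one extra unpaired base
    "lab_b": "(((...)))",  # one extra base pair (two characters)
    "lab_c": ".......",    # no pairs at all
}
print(similar_labs("((...))", labs, limit=3))
```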

There are many more options for things that you might search for, but most of those searches are already in use and should be continued: GC percentage, melting point, etc.

Maybe we should also be able to search based on how well the design synthesized – if, say, we wanted to see all designs that scored better than 90%, or worse than 80%, etc. This could be combined with other searches to narrow the results.

I’d kind of assumed that the synthesis score was in the search listings, but I’ve spotted another criterion I’d like to search on:

whether each particular bot solves the puzzle or not (and, if more bots are created, whether they can solve it too)

Oh, you’re right. We can already search that way. Hehe, never mind! I guess I thought we were starting over from scratch.

Another search criterion that might be of interest:

the percentage of the shape notation that matches the synthesised shape results (the blue and yellow plot) - and then also the ‘length of the longest continuous section of matching values’

This is a slight variation on the synthesis results: perhaps one section of an otherwise badly folded design synthesised perfectly, so that could be useful somewhere.
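Both of those numbers are easy to compute once the synthesis result is boiled down to "paired or unpaired" per base. A minimal sketch, assuming the target is a dot-bracket string where `.` means unpaired, and that the blue and yellow plot has already been turned into a list of booleans (True for paired). That conversion and the name `match_stats` are assumptions, not something the lab data is known to provide.

```python
def match_stats(target_shape: str, observed_paired: list[bool]) -> tuple[float, int]:
    """Return (percent of positions matching, longest run of matching positions)."""
    expected_paired = [ch != "." for ch in target_shape]
    if len(expected_paired) != len(observed_paired):
        raise ValueError("target shape and observed data differ in length")
    if not expected_paired:
        return 0.0, 0

    matches = [exp == obs for exp, obs in zip(expected_paired, observed_paired)]

    longest = run = 0
    for is_match in matches:
        run = run + 1 if is_match else 0  # a mismatch resets the current run
        longest = max(longest, run)

    return 100.0 * sum(matches) / len(matches), longest

# Made-up example: 8 bases, mismatches at the 5th and 8th positions.
target = "((....))"
observed = [True, True, False, False, True, False, True, False]
print(match_stats(target, observed))
```

A design with a low overall percentage but a long matching run would be exactly the "one section synthesised perfectly" case.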