Long before my biomedical informatics studies at Stanford, I learned to recognize the difference between what is computable and what is not. In my past three startups this knowledge has been the key to engineering success.
Recently I made a trip to visit my alma mater, the Rochester Institute of Technology, where I earned my computer science degrees. For those not familiar with RIT, it has been ranked as one the top 10 of universities in the Northeastern United States. Recently, Linked-In ranked RIT in the top 25 for software development programs in the nation, and as #13 for software developers specifically for startups.
While on campus, I reconnected with my thesis advisor, Prof. Stanisław P. Radziszowski. Dr. Radziszowski is a highly-respected computer scientist and mathematician, and is best known for his work in Ramsey theory. He has published a number of works in Ramsey theory, block designs, number theory, and computational complexity. Since 1984, Dr. Radziszowski has mentored and trained thousands of students at RIT and I was not sure if would remember me and my work. I was pleasantly surprised that he not only remembered me, but had my thesis readily available on his bookshelf. He told me that he was very excited about my work in combinatorial mathematics, and in the years since has shown it to students to inspire them to work on similar problems in computability theory.
One of the most important things I learned while a student of Dr. Radziszowski was how to skirt the line between what is computable and what is not. It is easy to imagine scenarios that seem like they should be easily solved with a computer, however there are many problems that turn out to be unsolvable. For example, let’s say you want to build a chemical storage facility. To minimize construction costs, you want to find the minimum number of buildings to store chemicals in, but you have to make sure no chemicals that react with each other are in the same building. Sounds simple, right? Well, turns out that there is no known solution that can be calculated optimally without nearly infinite computational resources.
My thesis at RIT was about the mathematical problem described by the chemical storage problem above. My method struck a balance between computability and complexity, right in the fuzzy center of what can be computed and what cannot. It involved developing a new heuristic which approximated the optimal solution. While my heuristic was not guaranteed to always produce the best solution, it produced a result that was theoretically very close to optimal, and better than other known approximation methods at the time.
Heuristics like these are the way that computer scientists model extremely complex systems. Today, there is no way to model every small detail that make up the complex interactions between organisms, disease, and drug treatments. There are too many variables; and to compute every possibility of interaction is impossible. We have perfected a computational technique at twoXAR that represents complex biological reactions in such a way that accurately reflects reality, but is simple enough to perform fast computation on. This technique is what enables us to go from biological data sets to accurate drug-disease prediction within minutes.
It was a pleasure to circle back and talk about twoXAR with the mathematician who inspired in me the principals behind our methodology all those years ago. Students at RIT should be honored to learn from such a talented individual, and I know the next generation of great data scientists are attending Dr. Radziszowski’s lectures today.