What do mice chasing crickets, particle accelerators, solid sponges for natural gas storage, and toddlers with cameras mounted on their heads have in common? All were the subjects of short talks at yesterday afternoon’s “Informal Symposium” on Machine Learning in the Sciences at the University of Oregon, which Teddy Hay, Gabriel Barello, and I co-organized. The main goal was to foster connections between groups in the sciences that are using machine learning techniques (broadly defined), and thereby help us all share methods, problems, and solutions. We had hoped 20 or 30 people would be interested; the actual turnout was over 80!
Everyone I’ve talked to seems to agree that the symposium went very well. While it’s still fresh in my mind, I’ll jot down some notes. The first part will be on the format of the symposium, which might be useful to people elsewhere planning similar things. The second part will note some of the outputs from the discussions, which might be useful locally.
Topics and Structure
The diversity of topics and attendees was remarkable, with graduate students, postdocs, and faculty spanning Physics, Neuroscience, Psychology, Earth Sciences, Chemistry, Ecology & Evolution, Linguistics, and more. We capped the duration of each of the 16 talks at 6 minutes and allowed no more than one per research group, to maximize the exposure to a variety of local colleagues’ work. Coexisting with this variety were similarities of subject (such as image classification), methods and software tools, and issues such as the validation of simulated data.
Intentionally, we focused on (i.e. I sent emails to) groups in the natural sciences, but we welcomed others who learned about the event and wanted to participate. We had a wonderful talk from Linguistics (classified as Humanities at UO), and I think another fascinating symposium could be solely devoted to Humanities and Social Sciences; exhaustively combining this and Natural Sciences, though, would be too much for a short symposium. It was great to have attendees from Computer Science, but I think our strategy of not actively seeking this was a good one — it would be easy to be overwhelmed by CS, and many of us believe strongly that our goal is not the development of new algorithms and methods, but the effective deployment of tools that already exist to help us do science.
Of course, it remains to be seen whether useful connections develop between symposium attendees (more on that below), but everyone present met people they didn’t previously know (I asked), was probably pleasantly surprised by the richness of expertise around campus, and hopefully had an enjoyable afternoon, so I’ll count the symposium as a success.
Discussions
We devoted about an hour of the three-and-a-half-hour symposium to discussions, with some specific suggested topics:
- What software / tools are people using?
- What training / resources / etc. would people like to see at UO [the University of Oregon]?
- Are there future collaborative projects / grants you’d like to discuss?
The second topic especially spurred a lot of conversation, which can be grouped into a few categories:
Workshops. Many people felt that workshops on various methods and techniques would be very useful. One challenge is the large variety of prior training and skills researchers have, especially new graduate students. (Even within the same department, some entering graduate students are experienced programmers; some have done zero programming. We’d like to train both of these groups.) This issue comes up in many contexts, and could perhaps be dealt with by having many different short workshops with different entry points, or sequential workshops that could be joined at various points. There were comments that workshops should involve hands-on exercises, including ones using one’s own data; that they could involve partnerships between the sciences and computer science; and that it would be valuable to bring in external experts.
Information about resources. Various departments run various courses on data analysis, programming, etc., and it was clear that many (all?) of us only knew of the existence of a small subset of these resources. Addressing this would be useful. Similarly, a list and description of the tools and packages that people here are using, with brief notes on what they’re good for, would be useful.
Drop-in help. A “consultant” for machine learning / analysis questions, whom one could drop in on for advice, would be valuable. Thanks to new initiatives at UO, this looks likely to exist!
Technical issues. A few people pointed out that data transfer is often a limiting factor, especially for using shared resources, and that data management and storage can be challenging.
Fostering connections. Some suggestions for this included having a yearly machine learning symposium; informal conversational meetings of people working on projects involving machine learning; an email listserv (already happening!); a seminar series; and campus-wide competitions.
Other things. There were broad questions of how to validate simulated data, what machine learning is learning, etc.
Of course, many of these ideas and questions about training, resources, and such things were not resolved — doing that will take thought, money, people, and time — but I think we’ve made progress on them and, most importantly, we’ve fostered connections around campus!
Thanks to everyone who attended, and to UO and the Materials Science Institute for providing space and the all-important coffee & cookies!
Today’s illustration
A photo of a painting in progress, of a chemical synapse. I made watercolor droplets, which look neat when wet and which dry into intense disks. The painting is finished; I’ll probably show it in a future post. An inset:
— Raghuveer Parthasarathy. September 21, 2018