I made data art with giant heads
Last week was the 2024 GC Data Conference and there's a lot to think about
I facilitated "Paint by Numbers: exploring creativity with data" (link to PPT, apologies if the formatting did not survive being imported to Google Slides) at the GC Data Conference 2024 last week. I usually describe it as a workshop where we look at and make art using data, but it’s actually secretly a call to action to make the data field more inclusive.
The pieces I made using the notes I collected at the conference reflect that. The talks that stuck with me the most were by speakers who are actively fighting algorithmic bias, challenging the use of statistical averages as the basis for models at all (since it is inherently marginalizing), or working on Indigenous AI models built by Indigenous people.
Also the idea of giant floating heads came to me like a vision and really made me laugh so I had to do it.
It was very validating to hear Timnit Gebru say to a public service audience what my partner and I have been shouting into the void about (although to be fair so have many others, see: Mystery AI Hype Theatre 3000) for months now: AI is a marketing term, it’s not really meaningful and assigns way too much agency and capacity to what currently exists.
That’s the other message I hope people took from the data art workshop: it’s easy to get caught up in the novelty, but for us to fully utilize the potential of any AI tool… we need to get much better at the data. So I hope it sparks some inspiration for folks to start interacting with data from wherever they are, with whatever tools and skills they have, and in slow, fun, beautiful, playful, or personally meaningful ways.
Things I’ve been reading and thinking about this week
Bilingualism by Design
I’ve been interested in the work of other officially bilingual countries for a while, particularly where the political context (colonial legacy) means that one language is treated as secondary and “just” a legal requirement. As someone who is still learning French, I worry a lot about what happens to the usability and quality of my content in the language I can’t develop in. And how do we do better when our processes rely on official translation services who are not subject matter experts, and then depend on one or two native speakers to double-check the final documents?
I can’t exactly remember how I ended up on the Welsh government content design side of LinkedIn, but it’s been incredibly enriching.
One of the first people I followed and continue to learn a lot from is Nia Campbell. She is a Senior Content Designer at Content Design London and native Welsh speaker who regularly works with both the Welsh and the central UK governments. [note: as I went to find the links above, I realized the reason I started seeing Nia’s work was through her colleague, Jack Garfinkel, from whom I’ve learned a ton about accessible design].
I had the pleasure of having a conversation with her earlier this week to learn more about each other's work and share some of the pain points and solutions of bilingual content development. She also organizes meetups for folks working on Welsh content design, and she was generous enough to extend the invite to me for future sessions. It is very exciting to be able to share and find commonality with folks even across the world.
Approaches to bilingual design that they are experimenting with:
trio writing (bringing an SME, a content designer, and a translator together to develop the content simultaneously in both languages): this may be the ideal, but it is unsustainable as long as linguistic services are separate entities that are always working at max capacity on translation requests.
user testing on the Welsh language content, as opposed to the typical process of testing the English content, integrating changes, then sending the final version to translation. The challenge with this approach is ensuring that any usability issues found in the Welsh version are reflected in the English. This part requires some flexibility in the process, as English content is sometimes the way it is because of tons of comments from policy, legal, etc.
Luckily, this community has been quite open with sharing their approaches and outcomes, so there are quite a few write-ups and talks. I’m sharing only a selection. The last link below is a blog post I read this week which included the insight that Welsh speakers like to have an easy toggle to allow them to flip between the English and Welsh to confirm understanding, accuracy, and unfamiliar jargon. Something I wish existed on Canada.ca.
Rethinking translation: can we design content bilingually? - Content Design London
Adeiladu Gwasanaethau Dwyieithog / Building Bilingual Services (10/11/2022) (youtube.com)
Our approach to Welsh language content and testing - Justice Digital (blog.gov.uk)
By coincidence, one of my favourite speakers from the data conference was Michael Running Wolf. He is currently working on Indigenous language revitalization using AI. Something he said that stuck with me is that building language models using English as a default limits innovation. The underlying structure and assumptions are never questioned and there’s enough data in English that you can fudge your way to a decent statistically probable sentence. By comparison, companies that need to address the unique linguistic structures of their language are pushing the boundaries of the field. The specific example cited was Baidu’s performance on the General Language Understanding Evaluation (GLUE) benchmark.
This article highlights work on Indigenous language AI which required building from scratch a voice recognition system that works with the verb-centred structure of Indigenous languages like Kwak'wala: How AI and immersive technology are being used to revitalize Indigenous languages | CBC News
The talk on Indigenous Knowledge & AI, including Dr. Suzanne Kite’s points about understanding the mechanisms of knowledge creation in context, has kicked off a bit of an epistemology rabbit hole. This is getting too long so I’ll save my love letter to librarians and archivists for next week.
Gatekeeping in the data sciences
I could easily rant about this for hours, but to keep it short, one of the reasons I work on data literacy now is because of the A-holes I’ve run across in the data field. Given the potential for harm, we should make sure that people are properly educated on (capital D) Data - how to manage it, how to collect/use it ethically, how to design research, etc. What drives me absolutely bonkers is what I usually call the “tech bro energy” that some folks bring to data work. They want everyone to know they are an expert, so they are patronizing, assume they know more than everyone, and make people feel stupid or incompetent.
One of the worst cases I worked with was someone who turned out to have built an unnecessarily complex tool that no one else could be trained on (I know because I spent weeks untangling and rebuilding it). The tool contained some specific hard-coded rules that only he knew about, and he held the team hostage with demands for promotions or threats to leave when things didn’t go his way. It resulted in constant churn and low morale for the team.
Data science is not magic. It does take time and practice to do it well. But it is doable. I like this blog post by someone who thought they were too late to start their data journey and what changed their mind: Are You Too Late to Start Your Data Science Journey? | by Soner Yıldırım | Towards Data Science
Upcoming events that look interesting
R-Ladies Ottawa is having an in-person networking event on Friday, March 8th (International Women's Day) at Beyond the Pale. The event is free to attend and you can RSVP here, by March 1st.
Shared Values Solutions are hosting “Data as Medicine: Indigenous Health through Protocol and Ceremony” on March 6, 2024, 2PM EST, on MS Teams. Registration link.
Extra Stuff
For placeholder text that's more fun than Lorem Ipsum:
Some data art with reflections in the form of knitting: A Knitted Reflection — Canadian Art Therapy Association
Links that I’ll try to remember to paste in every week if I remember
Connect with over 4,000 data people to share resources and job opportunities in the GC Data - Informal/Unofficial Facebook Group
If these weekly posts are not enough, here are almost 800 links to fill your heart. This is a data resource repository I maintain to keep track of all the things I've looked at or that are relevant to doing data work in government. The curation is skewed toward things I am interested in or am actively working on.