Learning something new: React Native

Over the end-of-year slowdown and the holidays, I’ve started to learn something new: React Native (and TypeScript along with it). It’s refreshing to approach a technology I haven’t actively used with a beginner’s mindset. Plus, it’s fun to build stuff.

A new tech stack for me: React Native and TypeScript

React Native is a framework for building mobile apps for both iOS and Android from a single codebase, written in either JavaScript or TypeScript.

You can do much more with React Native, but this is what it’s mostly used for.

Why React Native?

First, professional relevance: I work as an AI and Machine Learning Engineer, so I usually work in the Python ecosystem. However, ML software doesn’t live in isolation and we are often building web or mobile applications, either as internal tools or for product integration of the machine learning systems. To be able to build web and mobile applications, better knowledge of React and the ecosystem makes a lot of sense to me. In fact, my whole team has recently decided to up-skill in this direction.

Second, personal interest: Since I stopped working as a web developer in 2017, I haven’t really followed the changes in the web and JS space. I’ve remained curious about web technology and have always wanted to be able to build mobile apps for my personal use and potential side projects. React Native offers both, plus a lot of the knowledge will transfer easily to vanilla React for the web.

How I am learning

I like reading traditional paper books when learning something new because I can focus better when I look at printed paper rather than a digital screen.

Book 1: Professional React Native by Alexander Kuttig. A compact overview of the important elements of React Native projects and a collection of best practices. The book is not comprehensive in listing the available API methods, but I like this style: It’s a fast-paced guide that I can use to start building my own projects. The book has many pointers on important packages. There are some mistakes in the code listings and the code formatting is sometimes broken, so the whole thing feels a little rushed. Still, I’d recommend it if you have previous programming experience.
Book 2: Learning TypeScript by Josh Goldberg. A compact, but detailed look at the TypeScript language. I have only covered the basics of the language to get me started on my own projects, but I will continue reading this book because I want to make use of the full power of TypeScript in my projects. It’s very well explained and has clearly gone through a better editing process than Book 1 (which is what I would expect from an O’Reilly publication). Clear recommendation.
Learning by doing: As I am working through these books (and googling anything I don’t know), I am building my first project, see below.

The two books I am currently reading to learn React Native and TypeScript.

My first project: A Mastodon client

Having looked at the Mastodon API in a previous (Python) project, I decided to build a Mastodon mobile app for my personal use – or rather my learning experience.

I have worked on the project for a few days now, and it is almost at MVP-level, meaning it provides some value to the user (i.e. to me).

My first project: A Mastodon client. I split the first feature (the Home timeline) into 3 steps to start with something simple and slowly build on top, as I am learning new concepts.

What I’ve implemented and learned so far:

Project setup of a React Native app

This took longer than expected because I needed to update Node and Ruby versions on my Mac. This reminded me of the frustration I felt as a web developer 5+ years ago when every few weeks the community moved to a new build tool and all dependencies had to remain compatible. It took me around 2 hours for the setup, but I’m happy I came out on the other side because since then the dev experience with React Native and hot-reloading of the app in the phone simulator has been pleasant.

Fetching the personal home timeline

I decided not to use any Mastodon API wrappers but to use the REST API directly. It helps me learn what’s actually going on. This is straightforward using fetch() and casting the result to a matching type definition in TypeScript. Reading the home timeline requires authentication. I haven’t built a UI-based login flow yet, but I am simply passing the auth token associated with my Mastodon account.

Display of the home timeline

This is the only real feature I’ve implemented, but it helped me to learn quite a bit:

Build and structure React components
Use React hooks
Styling of React Native views
How to render the HTML content of the posts as native views
How to implement pull-to-refresh and infinite scrolling of a list view

What is still missing

For a full-fledged Mastodon client, I’ve maybe implemented 2% and the remaining 98% is still missing. Even for an MVP “read-only” app, I am still missing some crucial pieces:

Login flow
Display attachments (images, videos, …)
Detail view of the toots with more details (replies, like count, …)

I need to learn a few more core concepts to be able to implement these features, most notably navigation of multiple views and storing data on the device.

My plan is to build out this MVP version to continue learning the core concepts.

Afterwards, I will probably look for another project idea, one that is uniquely “my project”.

Ambitious ideas for this project

If I do end up working on the Mastodon app longer term, there are some ideas that would be fun to implement. In particular, I’d love to bring some of my Data Science / ML experience over to a mobile app. How about these ideas:

Detect the language of posts and split your timeline into localized versions
Detect the sentiment of posts and let the app know if you want to filter out clickbaity posts today
Summarize today’s posts in a short text (possible GPT-3/ChatGPT integration)
Cluster posts into topics (like “news”, “meme”, “personal” or “cat content”) so that you can decide if you’re in the mood to explore or simply want to focus on what’s relevant today
Include tools to explore your Mastodon instance or the whole fediverse: Find accounts you would like, and find accounts that are popular outside your own circles. Some inspiration is in my previous post on exploring the Fediverse.

Follow along

If you want to follow along, you can find my current project progress on GitHub. Remember that this isn’t meant as an actual Mastodon client, but as an educational exercise for myself. Use at your own risk.

GitHub for the project source: https://github.com/floriandotpy/rn-mastodon

Exploring the Fediverse

Like many, I have been looking for a new digital community in the past few weeks (the old one is on fire) and have found a place on Mastodon.

You can find and follow me at https://sigmoid.social/@florian

I’ve picked the Mastodon instance sigmoid.social, an AI-related instance that is only 3 months old but already has close to 7000 users.

Machines talking to each other

Each Mastodon instance has a public API so it’s straightforward to fetch some basic statistics even without any authentication. I wrote some simple Python scripts to fetch basic info about my home instance.

You can find my scripts on GitHub if you’re interested in doing something similar (very rough code): https://github.com/floriandotpy/mastodon-stats
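
To give an idea of what such a script can look like, here is a minimal sketch. It is not taken from the repository above. It assumes the instance exposes the v1 endpoint /api/v1/instance with a "stats" object (user_count, status_count, domain_count); the function names fetch_instance_info and summarize_instance are made up for this illustration.

```python
import json
from urllib.request import urlopen


def fetch_instance_info(domain, timeout=10):
    """Fetch public instance metadata; no authentication is needed.

    Assumes the instance serves the v1 endpoint /api/v1/instance.
    """
    with urlopen(f"https://{domain}/api/v1/instance", timeout=timeout) as response:
        return json.load(response)


def summarize_instance(info):
    """Pick a few basic statistics out of the instance metadata.

    Missing fields come back as None instead of raising an error.
    """
    stats = info.get("stats", {})
    return {
        "title": info.get("title"),
        "users": stats.get("user_count"),
        "posts": stats.get("status_count"),
        "known_instances": stats.get("domain_count"),
    }


# Usage (performs a network request):
# print(summarize_instance(fetch_instance_info("sigmoid.social")))
```

Keeping the network call separate from the summary step makes it easy to try the summary on saved JSON without hitting the instance again.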

Who else is on my home instance?

I wondered: Who are the other users on sigmoid.social? To gain an overview, I fetched the profiles of all user accounts that are discoverable (which at the time of writing means 1300 accounts out of 6700).

Most profiles include a personal description, typically a short bio. I plotted these as an old-fashioned word cloud.
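
The counting step behind a word cloud can be sketched as follows. This is an illustration, not the original script. It assumes the bios arrive as HTML strings (Mastodon returns the profile description in the account's "note" field as HTML), and the stop word list is a tiny hand-picked placeholder.

```python
import html
import re
from collections import Counter

# Tiny illustrative stop word list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "at", "to", "for", "on", "is", "my", "with"}


def bio_to_words(note_html):
    """Strip HTML tags from a bio and return its lowercase words."""
    text = re.sub(r"<[^>]+>", " ", note_html)  # drop tags such as <p> or <a ...>
    text = html.unescape(text).lower()         # turn &amp; etc. back into characters
    words = re.findall(r"[a-z][a-z\-]+", text)  # words of at least two letters
    return [word for word in words if word not in STOPWORDS]


def word_frequencies(bios):
    """Count how often each word occurs across all bios."""
    counts = Counter()
    for note in bios:
        counts.update(bio_to_words(note))
    return counts
```

The resulting counts are what a word cloud library would scale the font sizes by.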

The insight isn’t that surprising: The place is swarming with ML researchers and research scientists, both from universities and commercial research labs.

Who is present on sigmoid.social? Getting an overview from this word cloud generated from user profile bios.

A stroll through the neighborhood

You don’t want to have an account surrounded by AI folk? No problem, there are more than 12,000 instances to choose from (according to a recent number I found). And they can all talk to each other.

I wanted to see how connected the instance sigmoid.social is and plotted its neighborhood.

This is the method I used to generate the neighborhood graph:

Fetch the 1000 most recent posts present on the instance (which can originate from any other Mastodon instance).
Identify all instances that occur among these posts, and fetch their respective recent posts.
With all these posts of a few hundred instances, create a graph: Each instance becomes a node. Two nodes are connected by an edge if at least five of the recent posts connect the two instances.

My method is naive, but it works sufficiently well to create a simple undirected graph.
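
The edge-building step can be sketched like this. It is a simplified illustration under a few assumptions: the posts have already been fetched and grouped by the instance they were fetched from, each post carries an "account" object whose "acct" field reads "user@domain" for remote accounts and plain "user" for local ones, and the names origin_domain and build_edges are invented for this example.

```python
from collections import Counter


def origin_domain(status, home_domain):
    """Return the domain a post originates from.

    Remote accounts look like "user@domain"; local accounts have no "@".
    """
    acct = status["account"]["acct"]
    return acct.split("@", 1)[1] if "@" in acct else home_domain


def build_edges(posts_by_instance, min_posts=5):
    """Build the undirected edge set of the neighborhood graph.

    posts_by_instance maps an instance domain to the recent posts
    fetched from it. Two instances are connected if at least
    min_posts posts link them, counted in both directions.
    """
    pair_counts = Counter()
    for home, statuses in posts_by_instance.items():
        for status in statuses:
            origin = origin_domain(status, home)
            if origin != home:
                # frozenset makes (a, b) and (b, a) the same pair
                pair_counts[frozenset((home, origin))] += 1
    return {tuple(sorted(pair)) for pair, count in pair_counts.items() if count >= min_posts}
```

The returned pairs can be handed to any graph library for layout and plotting.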

The graph yields another unsurprising insight: All roads lead to mastodon.social, the largest and most well-known instance (as far as I know).

Neighboring instances (based on their most recent 1000 toots).

Join us on Mastodon?

I may or may not become more active as a poster myself. In any case, feel free to come over and say Hi: https://sigmoid.social/@florian

To see how these figures were created, find the scripts on GitHub (very rough code): https://github.com/floriandotpy/mastodon-stats

New domain

Welcome to casualcoding.com, the new location of my blog.

If you’re one of the few who still use RSS feeds, make sure to update your feed URL to https://casualcoding.com/feed. The old URLs will redirect automatically, but better safe than sorry.

Book Review: Machine Learning Design Patterns

Machine Learning Design Patterns: A good overview and reference book for machine learning topics in production environments.

If you are anything like me, “machine learning” to you means working with algorithms that adapt their function from data. And while this is true, it’s not the complete story when actually working in the field of machine learning.

Yes, picking the right algorithm and creating the appropriate model is important. But data cleaning, optimizing for production, and setting up scalable infrastructure are just as much a part of the day-to-day work on the job.

Along comes Machine Learning Design Patterns, a book that looks at common challenges in practical machine learning. It leaves out model architectures on purpose and instead promises to collect best practices to move machine learning to production.

Each chapter: A (machine learning) design pattern

The book’s title plays on the famous 1994 book by the Gang of Four that popularised the concept of design patterns in software engineering. It comes as no surprise that in Machine Learning Design Patterns, the authors attempt something very similar and identify recurring problems from their field.

Each chapter is structured the same way:

What is a common problem in machine learning?
What is the recommended solution?
What are trade-offs and alternatives?

What’s in the book

Written by three engineers from the Google Cloud AI team, the book covers a breadth of topics:

Data Representation
Problem Representation
Model Training
Resilient Serving
Reproducibility
Responsible AI

The early chapters cover topics frequently occurring in the “data science” section of machine learning: What is an embedding, how to work with imbalanced datasets, how to create proper checkpoints during training.

In the later parts, the authors focus more and more on inference and challenges of automation, repeatability and scalability. I now have an idea what a feature store is and how to bridge data schemas when mixing old and new data sources.

A glance at the table of contents. Each chapter follows (more or less) the same structure.

Examples: Python and SQL code

The book is full of examples and they use different technologies: TensorFlow examples in Python, a BigQuery listing, a Google Cloud SDK API call.

From the preference of technologies, you notice that the book has been written by Googlers. This doesn’t matter in my mind, because the concepts are always explained clearly, so that porting to other platforms or products should be straightforward.

What I liked

The book covers a wide range of topics and helped extend my knowledge of machine learning to areas I am not an expert in: Resilient Serving, Reproducibility and MLOps in general.

The structure of design patterns lends itself to keeping this book on the shelf for future reference. Chapters have a clear motivation and are written to the point, so that I can see myself looking up a design pattern in the future.

Aside from technical topics, the authors also include three chapters about responsible AI and a (brilliant) section about the ML Life Cycle and the AI Readiness of organisations.

What I didn’t like

I have a few complaints, though.

In places it becomes clear that the idea of extracting design patterns from machine learning approaches works well for some topics, but becomes a bit of a stretch for others. I personally didn’t mind this too much, but it’s not as elegant as the title of the book suggests.

What I did mind was the fact that this first edition is quite riddled with errors: From figures containing incorrect numbers that don’t align with the text (just annoying) to an explanation of convolution layers that confused convolution with pooling, I think (potentially misleading).

And a final nitpick: The printed copy I ordered was a monochrome version with low contrast. Many figures were completely indecipherable. A bit disappointing for an O’Reilly book upwards of 40€.

Thin paper, suboptimal graphics. The printed version doesn’t feel very premium.

Conclusion: Great overview to bring ML to production

In conclusion, Machine Learning Design Patterns gives a great overview of common problems you encounter when designing, building and deploying machine learning algorithms.

It will offer valuable content for many in the industry: Data scientists who have never deployed a cloud pipeline, ops experts who are curious about “MLOps” and the product person who wants to understand the constraints and possibilities of modern machine learning development.

My favourite chapter was actually a non-technical one: How to move a team and a whole company from running first ML experiments to becoming an ML-first organisation. This idea ties a lot of the technical and human topics together and it is a topic that excites me personally.

I’ve enjoyed working through this book (together with my data science study group) and it will find a valued place on my bookshelf, to be referenced whenever I encounter one of the problems in the wild again and need a foundational perspective.

Machine Learning Design Patterns, the printed copy. For long-ish reads like this, I personally prefer the physical copy.

How I spy on my cats when I’m not at home

With a second cat moving in this week, I am even more curious than before about what happens at home when no human is around.

There is a range of products that offer pet monitoring. Given I have an unused Raspberry Pi and a spare webcam lying around, I decided to DIY the solution.

Building a pet cam using a Raspberry Pi, a webcam and tailscale

As it turns out, this is easy: Take a Raspberry Pi, connect a webcam and install motion – an open source tool that exposes the webcam stream over the local network.

To enable remote access, I used Tailscale, which creates a private network for all my devices, no matter where they are located physically.

This guide has great step by step instructions which take less than 30 minutes to complete: https://tailscale.com/kb/1076/dogcam/

One thing to note: Pick the right Raspberry Pi. I started off with the model B+ (from 2014) which is a little underpowered and even has issues running the current Raspberry Pi OS smoothly. Luckily, I also had a “Pi 3 Model B” sitting in a drawer which did the job just fine.

My final setup: A headless Raspberry Pi – no keyboard or display attached. Plug in webcam, ethernet and power, and the stream will start automatically.

Save snapshots when motion detected

The motion project comes with some handy features: Whenever motion is detected, it will save a snapshot frame and even short videos. These are stored on the Pi (under /var/lib/motion by default) and include the time of the event.

Even when I’m not watching the webcam stream, these recordings give me a summary of what the furballs were up to while I was gone.
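
A quick summary of those recordings could be sketched like this. It is only an illustration with a few assumptions: snapshots are saved as JPEG files in a single directory, and the file modification time is used as the event time, because the file naming scheme depends on how motion is configured. The function name events_per_hour is made up for this example.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path


def events_per_hour(directory, suffixes=(".jpg", ".jpeg")):
    """Count snapshot files per hour of day (0-23, local time).

    Uses each file's modification time as the time of the event.
    """
    counts = Counter()
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix.lower() in suffixes:
            hour = datetime.fromtimestamp(path.stat().st_mtime).hour
            counts[hour] += 1
    return counts


# Usage on the Pi (the directory is the default mentioned above):
# for hour, count in sorted(events_per_hour("/var/lib/motion").items()):
#     print(f"{hour:02d}:00  {count} snapshots")
```

Videos are ignored here on purpose, so that one event is not counted twice.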

Tailscale: Pretty cool

I hadn’t used it before, but Tailscale really was perfect for this case: The Raspberry Pi is one device in my virtual network. The other two I’ve added are my laptop and my phone.

Anytime I want to check the webcam stream I simply open the browser and access the (virtual?) IP of the Pi.

On the iPhone, I added that URL to the home screen so the ominous toilet stream is always in reach.

Next level: Five eyes

The initial setup was easy. I now have one webcam that I can place anywhere (anywhere the ethernet cable reaches, that is). A set of these would be cool so that I could monitor all movement in the flat.

My ambitions to build the next NSA for cats are limited though, so I’ll probably stick to a single cam and point it at one key location. Right now it’s looking at the litter box and I’m working on a spreadsheet to plot the bowel movements of the little one. Uhm, yeah.

The new oil

Another idea for the next step: Collect the visual data over time and run some vision algorithms. How often do they eat? Do they really sleep 16 hours a day? Who spends more time in each room? All of them cool ideas (which I’ll never implement, let’s be honest).

This was a quick project. I can now check in on my cats when I’m out and about. A Saturday afternoon well spent.