Scraping iMessage and Messenger Messages and Displaying with Vue Frontend

Credit: she founded the project and provided the first version of the scraper.

A while ago my partner in the organization started message-analyzer because she thought it would be interesting to analyze the message data between us. She managed to scrape text messages out of both iMessage and Messenger (the two chat apps that we use), put them together, and built something that could decide which one of us a message came from. I believe the highest success rate she got to was 86%.

I was looking around in the project after she got most of it done and noticed a file called app.py that runs a Flask application and serves the text messages on a web server. Since I’m pretty much a frontend developer now (no), I came up with the idea of displaying all of our messages on a web page, hopefully merging content from both the Apple and Facebook platforms.

iMessage

I started with iMessage. It wasn’t too hard to simply take the output of the function that she wrote and serve it over an API endpoint. For the frontend I decided to try out Vue.

It wasn’t long before I got to the following:

The main component simply requests all messages and passes each one to a Message component. I added pagination for some convenience.

The Message component looks like this:

It just displays the message content. If hovered, the delivered time is shown as a tooltip.

It all looked good, but how about attachments? There were hundreds of interesting images, stickers and files that we sent each other. The page would not be nearly as interesting if those were missing from it.

To show attachments, I dug deeper into how Apple stores messages.

Inspired by my partner’s work, I looked at where Apple keeps the data: messages are stored in a SQLite database located at ~/Library/Messages/chat.db, so I took the liberty of looking at the schema.

Three tables caught my attention: attachment, message_attachment_join, and message.

attachment:
    filename
message:
    ROWID
message_attachment_join
    message_id
    attachment_id

The message_id matches the ROWID on the message table. filename is actually a path to the attachment file on the local machine. With this information at hand, I revised the SQLite query to join the three tables.
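The revised query itself isn't reproduced here, so the following is only a minimal sketch of what such a join might look like, written with Python's sqlite3 module. The table and column names come from the schema notes above; the `text` column on message, matching attachment_id against the attachment table's ROWID, and the helper names are assumptions made for illustration. It runs against a tiny in-memory stand-in rather than the real chat.db.

```python
import sqlite3

# Table/column names follow the schema notes above. `message.text` and the
# use of `attachment.ROWID` as the join target are assumptions of this sketch.
QUERY = """
SELECT message.ROWID, message.text, attachment.filename
FROM message
LEFT JOIN message_attachment_join
    ON message_attachment_join.message_id = message.ROWID
LEFT JOIN attachment
    ON attachment.ROWID = message_attachment_join.attachment_id
ORDER BY message.ROWID
"""


def fetch_messages_with_attachments(conn):
    """Return one dict per message, each with a list of attachment paths."""
    messages = {}
    for rowid, text, filename in conn.execute(QUERY):
        entry = messages.setdefault(
            rowid, {"id": rowid, "text": text, "attachments": []}
        )
        # LEFT JOIN yields NULL for messages without attachments.
        if filename is not None:
            entry["attachments"].append(filename)
    return list(messages.values())


def build_demo_db():
    """Build a small in-memory stand-in for chat.db with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE message (ROWID INTEGER PRIMARY KEY, text TEXT);
        CREATE TABLE attachment (ROWID INTEGER PRIMARY KEY, filename TEXT);
        CREATE TABLE message_attachment_join (
            message_id INTEGER, attachment_id INTEGER
        );
        INSERT INTO message VALUES (1, 'hello'), (2, 'look at this');
        INSERT INTO attachment VALUES (10, '~/Library/Messages/Attachments/cat.png');
        INSERT INTO message_attachment_join VALUES (2, 10);
        """
    )
    return conn
```

The LEFT JOINs keep messages that have no attachment at all, which a plain inner join would drop.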

After the messages and attachments are selected, I served the attachments over the API endpoint ‘/attachments’, and voilà, pictures on the page!

I later also displayed reactions to messages but I’d like to get to scraping Messenger soon.

Messenger

Scraping Messenger is a little trickier: my partner did it by scrolling all the way up to the top, saving the HTML file and extracting information from there. However, since the data has already been parsed once by the Messenger frontend, it was a little difficult to get the dates and attachments as well as the messages.

I went into Chrome devtools and saw that the juicy request was to the URL facebook.com/graphqlbatch. Ah, so they use their own product. What’s frustrating is that each request retrieves at most ~200 messages, and Chrome doesn’t let me copy multiple request responses at a time.

I tried to reverse engineer how the requests are formatted, but got stuck figuring out how the message count offset was sent. I then came up with the idea of writing a Chrome extension to capture the web requests.

The only API that allows you access to response bodies is devtools. Creating an extension is also easy: all it needs is a manifest.json file that describes the extension and some JS scripts to be run by the browser, so I did this:

and used pyautogui from my partner’s code to automatically scroll up like an idiot. I was able to get all messages in the devtools window of the devtools window (no typo). The repository is here.

All that was left was parsing the data retrieved and making sure both message sources end up having the same format when returned by the Flask server. Messenger had more attachment types and multiple attachments so it took me longer.
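As a rough illustration of that last step, here is a minimal sketch of folding both sources into one shape before the server returns them. Every field name below (`sender`, `timestamp_ms`, `url`, and so on) and the `UnifiedMessage` class are hypothetical; the real record layouts from chat.db and from the captured Messenger responses are not shown in this post.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class UnifiedMessage:
    """Hypothetical common shape returned for both sources."""
    source: str       # "imessage" or "messenger"
    sender: str
    text: str
    timestamp: float  # seconds since the Unix epoch
    attachments: List[str] = field(default_factory=list)


def from_imessage(row):
    """Convert a hypothetical iMessage row with one optional attachment."""
    attachments = [row["filename"]] if row.get("filename") else []
    return UnifiedMessage(
        "imessage", row["sender"], row["text"] or "", row["timestamp"], attachments
    )


def from_messenger(node):
    """Convert a hypothetical Messenger record with zero or more attachments."""
    urls = [a["url"] for a in node.get("attachments", [])]
    return UnifiedMessage(
        "messenger",
        node["sender"],
        node.get("text") or "",
        node["timestamp_ms"] / 1000,  # assumed to be in milliseconds
        urls,
    )


def merge_sources(imessage_rows, messenger_nodes):
    """Normalize both sources and return plain dicts sorted by time."""
    unified = [from_imessage(r) for r in imessage_rows]
    unified += [from_messenger(n) for n in messenger_nodes]
    unified.sort(key=lambda m: m.timestamp)
    return [asdict(m) for m in unified]
```

Keeping attachments as a list on both sides means the frontend never has to care which platform a message came from.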

For privacy reasons, I can’t do a demo here :/ well mostly it’s just that I’m too lazy to put up a page with fake message data.

For future features I plan to do searching, improve pagination, style the Messenger system messages (“you waved at each other”), and make the UI prettier and easier to use.

Daily bothering with launchd and AppleScript 

Credit: All of the following “someone” is this one.

This morning I noticed this repo was forked into my GitHub organization. I’m still not sure what the original intent was but I interpreted it as a permission/offer to contribute. Since the repo’s name is “simplifyLifeScripts”, I spent some time pondering what kind of scripts would simplify lives, or more specifically, my life. I then came up with this brilliant idea of automating iMessage sending so that my Mac can send someone this picture of Violet Evergarden on a daily basis:

[Image: Violet]

In the past I had to do this manually by dragging the picture into the small iMessage text box, which was simply too painful to do (I blame Apple). How cool and fulfilling would it be to sit back and let an Apple product annoy an Apple employee!

After some GOOGLing I came across this snippet of AppleScript that lets you send an iMessage with your account:

on run {targetBuddyPhone, targetMessage}
    tell application "Messages"
        set targetService to 1st service whose service type = iMessage
        set targetBuddy to buddy targetBuddyPhone of targetService
        send targetMessage to targetBuddy
    end tell
end run

Basically it takes the iMessage service from the system and tells it to send the message to a person given a phone number.

Since I also have to send an image as attachment, I added to this piece so it became:

on run {targetBuddyPhone, targetMessage, targetFile}
    tell application "Messages"
        set targetService to 1st service whose service type = iMessage
        set targetBuddy to buddy targetBuddyPhone of targetService

        set filenameLength to the length of targetFile
        if filenameLength > 0 then
            set attachment1 to (targetFile as POSIX file)
            send attachment1 to targetBuddy
        end if

        set messageLength to the length of targetMessage
        if messageLength > 0 then
            send targetMessage to targetBuddy
        end if
    end tell
end run

It now takes one more parameter, the file name. The script converts the file name to a POSIX file and sends it as an attachment. I also added two simple checks so that I can send text and/or a file.
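For trying the script by hand, it can be invoked through osascript with the three positional arguments in the same order as the `on run` handler. A small Python wrapper is one way to sketch that; the function names are made up for illustration, and the `send` call only works on macOS with Messages set up.

```python
import subprocess


def build_send_command(script_path, phone, message="", attachment=""):
    """Build the osascript argv; an empty string skips that part in the script."""
    # The AppleScript expects exactly three arguments, so empty values are
    # still passed along and filtered out by its length checks.
    return ["osascript", script_path, phone, message, attachment]


def send(script_path, phone, message="", attachment=""):
    """Run the AppleScript (macOS only); raises if osascript exits non-zero."""
    subprocess.run(
        build_send_command(script_path, phone, message, attachment), check=True
    )
```

Calling `send("sendMessage.applescript", "9999999999", "hi")` would send text only, since the attachment argument stays empty.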

The next step would be to automate the process. Just when I was ready to Google one more time, someone pointed me to Apple’s launchd, which is similar to Unix’s cron. launchd lets you daemonize pretty much any process. You compose a plist file (a special form of XML) and put it under /Library/LaunchDaemons/, and the daemon then starts as one of the system startup items.

Following the official guide, I made the following plist file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.billyu.botherlucy</string>
    <key>ProgramArguments</key>
    <array>
        <string>osascript</string>
        <string>/Users/billyu/dev/simplifyLifeScripts/sendMessage.applescript</string>
        <string>9999999999</string>
        <string>Daily Lucy appreciation :p</string>
        <string>/Users/billyu/dev/simplifyLifeScripts/assets/violet.png</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>0</integer>
    </dict>
</dict>
</plist>

The ProgramArguments key is mapped to an array of arguments used to execute the process wrapped in the daemon. In my case, I just run osascript to execute the AppleScript at the absolute path, with the phone number, text message, and the image absolute path as parameters. The phone number is obviously censored.

The other key, StartCalendarInterval, is a handy way to run the job periodically. Any missing key is treated as a “*” wildcard. In this case, the process would be run every day at 00:00. I later changed it to 22:00 after realizing my computer might be shut down at midnight. Can’t miss the bother window.
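If you'd rather not hand-edit XML, the same job description can be generated with Python's standard plistlib module. This is only a sketch of the 22:00 variant: the helper names are made up, and the explicit Minute key is an addition of this sketch (the plist above does not set it), included so the intended time is spelled out in full.

```python
import plistlib


def build_job(label, program_arguments, hour, minute=0):
    """Describe a launchd job as a plain dict (helper name is hypothetical)."""
    return {
        "Label": label,
        "ProgramArguments": list(program_arguments),
        # Minute is set explicitly here; the original plist leaves it out.
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }


def render_plist(job):
    """Serialize the job dict to XML plist bytes."""
    return plistlib.dumps(job)


job = build_job(
    "com.billyu.botherlucy",
    [
        "osascript",
        "/Users/billyu/dev/simplifyLifeScripts/sendMessage.applescript",
        "9999999999",
        "Daily Lucy appreciation :p",
        "/Users/billyu/dev/simplifyLifeScripts/assets/violet.png",
    ],
    hour=22,
)
plist_bytes = render_plist(job)
```

Writing `plist_bytes` to a file in binary mode gives something that can be copied into the launchd directory like the hand-written version.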

To avoid restarting my laptop, after copying the file to the launchd directory I did sudo launchctl load {plist file path} so the daemon would start right away.

I did some testing with sending the message every minute and it worked perfectly. It’s worth noting that this is one of the few things that just worked on the first try.

Excited for 10pm tonight! Although someone else might not be.