Rank Images Site Built with React + Express + MongoDB

Not long ago I built this site that lets users rate images and produces a ranking. Yes, this looks similar to Zuckerberg’s notorious rank-female-faces site, but the project simply lets you rank any set of images. The content is not my concern.

Backend

MongoDB was chosen as the database for storing user info, images, and ratings. When my friends and I came up with the idea for this site, we also wanted to compare aesthetic preferences within our friend circle, so we decided to pre-generate a queue of image pairs for all users to rate; each user’s progress through the queue is stored as a field on the user model. The pairs are stored as a standalone collection.

User model:

Queue model:

Rating and Image models are also created, but their structures are simple enough that I don’t need to include them here. The Rating model saves which user voted for which candidate in a Queue entry; the Image model just saves the image data and file type. A vote property is added to the Image model to simplify the sorting process.

The API endpoint design is also straightforward. There are three sub-routers:

  1. image
    1. GET API for images
  2. rank
    1. GET the ranking of all images
  3. rate
    1. GET the next pair to rate, given a user id
    2. POST the user’s rating

There is no user login/signup. Every user is “invited”: their information is put directly into the database, and they can then enter their unique ObjectId on the page to access the rating page. The ranking result page is public to all.

Frontend

The frontend is also very simple: A React Router handles a home page, a rating page, and a ranking result page.

  • The home page takes the user’s unique id and redirects to the rating page:
    • this.props.history.push(`/rate/${this.state.userid}`);
  • The rating page extracts the unique id from the URL and sends a request to the server for the pair to rate:
    • nextEntryForUser(this.props.match.params.userid)
    • nextEntryForUser calls the API endpoint and passes the userid for the Queue object.
  • The rank page sends a GET request to the server and displays all the images in order.

The whole site is mobile responsive. I probably spent more time tuning the css than coding up and refactoring the components. -_-

I reused the Docker deployment mentioned in this article.

Video demo:

Takeaway/left to do

  1. data analysis on the ratings (our original incentive for building this)
  2. allow MongoDB to be connected remotely (this includes configuring the security settings so no one can trivially hack us)
  3. find interesting and LEGAL images to rate
  4. learn more about infrastructure, which is my actual job responsibility

The repository is here.

Image Uploading and Attaching for the Blog Project

Today I have been working on a feature that allows users to upload images as attachments and include them in Markdown format.

Here’s a demo of the complete feature:

The first decision I made was how I should accept the upload. After several minutes of thinking with my pea brain, I decided to use multer to accept file uploads from clients. multer stores each uploaded file in a directory under a randomly generated filename to avoid name collisions. It works well with Express in that it sets the `file` property on `req` with useful fields such as the original name, size, and path of the file.

Without much further thinking (which later proved to be a mistake), I thought it would be natural to simply serve the directory that has all the uploaded files.

So I wrote these:

API for upload

const express = require('express');
const router = new express.Router();
const fs = require('fs');
const multer = require('multer');
const upload = multer({ dest: 'uploads/' });
const auth = require('../auth');

const { ALLOWED_EXTENSIONS, MAX_SIZE } = require('../config');

router.post('/', [auth.admin, upload.single('upload')], (req, res, next) => {
  const filename = req.file.originalname;
  const path = req.file.path;

  const splitArr = filename.split('.');
  if (splitArr.length === 1 || !ALLOWED_EXTENSIONS.includes(splitArr.pop().toLowerCase())) {
    removeFile(path);
    return res.status(403).json({ message: 'Forbidden file extension' });
  }

  if (req.file.size > MAX_SIZE) {
    removeFile(path);
    return res.status(403).json({ message: `File exceeds maximum allowed size: ${MAX_SIZE / 1000000} MB` });
  }

  res.json({ path: req.file.path });
});

function removeFile(path) {
  fs.unlink(path, err => {
    if (err) console.log(err);
  });
}

module.exports = router;

serve directory:

app.use('/uploads', express.static('uploads'));

frontend’s upload method

onUpload() {
  if (!this.file) {
    this.props.displayMessage('You haven\'t selected any file yet!');
    return;
  }

  const data = new FormData();
  data.set('upload', this.file);
  instance.post('/uploads', data, {
    headers: { Authorization: 'Bearer ' + this.props.token },
  })
    .then(response => {
      const files = this.state.files.slice();
      files.push(response.data.path);
      this.setState({
        files,
      });
      console.log(files);
    })
    .catch(err => {
      this.props.displayMessage(err.response.data.message);
    });
}

This worked fine. However, when I deployed some other minor changes, such as a progress bar for uploads and a one-click copy-path button, and hit refresh on my browser – the images were gone! I soon realized this was because Docker created new containers for the changed files, and the files on the original containers would be lost unless I backed them up or mounted a volume to the host.
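One way to avoid losing the files, had I kept them on disk, is to bind-mount the upload directory to the host so it survives container rebuilds (the image name and paths here are assumptions):

```shell
# Keep uploads on the host; a new container sees the same directory.
docker run -d --name blog \
  -v "$(pwd)/uploads:/app/uploads" \
  my-blog-image
```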

That was when I decided to store all of the image files in MongoDB. This way, the images live in the same place as the post contents, which makes backing up easy. It was also easy to implement because I already had code for similar schemas.

With some copy pasta:

Schema for images:

const mongoose = require('mongoose');

const ImageSchema = new mongoose.Schema({
  data: Buffer,
  contentType: String,
}, { timestamps: true });

mongoose.model('Image', ImageSchema);

API handler for uploading and retrieving images:

const express = require('express');
const router = new express.Router();
const fs = require('fs');
const multer = require('multer');
const upload = multer({ dest: 'uploads/' });
const auth = require('../auth');
const mongoose = require('mongoose');
const Image = mongoose.model('Image');

const { ALLOWED_EXTENSIONS, MAX_SIZE } = require('../config');

router.get('/:id', (req, res, next) => {
  const id = req.params.id;
  Image.findById(id).exec()
    .then(image => {
      if (!image) {
        return res.sendStatus(404);
      }
      res.contentType(image.contentType);
      res.send(image.data);
    })
    .catch(err => {
      if (err.name === 'CastError') {
        return res.sendStatus(404);
      }
      next(err);
    });
});

router.post('/', [auth.admin, upload.single('upload')], (req, res, next) => {
  const filename = req.file.originalname;
  const path = req.file.path;

  const splitArr = filename.split('.');
  const extension = splitArr.pop().toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(extension)) {
    removeFile(path);
    return res.status(403).json({ message: 'Forbidden file extension' });
  }

  if (req.file.size > MAX_SIZE) {
    removeFile(path);
    return res.status(403).json({ message: `File exceeds maximum allowed size: ${MAX_SIZE / 1000000} MB` });
  }

  const image = new Image({
    data: fs.readFileSync(path),
    contentType: `image/${extension}`,
  });
  image.save()
    .then(saved => res.json({ path: `uploads/${saved._id}` }))
    .then(() => removeFile(path))
    .catch(next);
});

function removeFile(path) {
  fs.unlink(path, err => {
    if (err) console.log(err);
  });
}

module.exports = router;

I also had to catch ‘CastError’ in GET because the stupid mongoose throws when the param cannot be cast into an ObjectId.

Basically, when the user uploads, I store the file in MongoDB, delete the file from the filesystem, and return the ID of the file in the database. The ID can then be used with the GET API.

I’m also proud to say that this API endpoint is unit tested and (almost) fully covered. No, I do not wish to discuss the overall 0.2% coverage drop for the repo.

As I said above, I also added a progress bar, feedback for copying the path, an error prompt for invalid files, and some other UI features. The GitHub issue is now closed, and I just have to wait for the requester to come back online for my demo :).

Blog Post System using PHP

Today I would like to talk in a little bit more detail about my blog post system written in PHP.

The main page looks like this:


First, the database structure:


The structure is actually pretty straightforward: one table for user authentication, one for posts, and one for comments. For user authentication, the password hashed with MD5 is stored in the database. When a user attempts to log in, the hash of their input is compared with the one in the database, a traditional approach. For each post, the two main pieces of information are topic and content; they form the body of a post. The author is stored simply as the username. The date is stored as a formatted string instead of a UNIX timestamp because somehow I could not get that to work :(. For each comment, its associated post is stored as articleId. When I present the comments of an article, I query the database for this articleId. This might be slower than other methods such as keeping references, but since I’m not storing a million blog posts, this works just fine.

Recently I finished the paging and comment systems. For paging, I first query the post table to get the total number of posts. Then, according to the articles_per_page variable set in config.php, I query the table again with a LIMIT to fetch the posts for a specific page. The page index is given via a GET request; if there is no such information in $_GET, the default value is 1, obviously.

For now, comments can only be viewed after you click on an article to see its details. At the bottom of the article, I query the comment table for the articleId. A helper method does this and returns the comments as an array of objects. I then simply loop over the array and echo them onto the page.

Posting a comment is a little bit different: the POST request is handled by another PHP file which does not render anything. After storing the comment in the database, the script redirects back to the article. In the POST request, only the content is passed; articleId is passed via the superglobal variable $_SESSION. I’m not sure if this is the best way, but it is certainly easier to write than the curl method I found online.

Several problems I encountered:

  1. For creating a post, not only do I need to verify the user is logged in when the page is presented, I also need to verify it when the POST request is received, because software such as Postman can easily forge a POST request and flood the database.
  2. For the frontend, I find that the CSS property clear: both is amazingly useful. I used float a lot in my page design, so a lot of the time I need it to keep the layout of divs stable.
  3. Typos are a bitch, especially those inside double quotes. Coding on a server, there is no nice IDE to remind me of a syntax error or a typo, so I really need to be careful about these. Sometimes one typo took me twenty minutes to debug.
  4. Security. When I gave the address to my friend to test, he easily hacked the site with some simple JavaScript, which forced me to filter any input that users give to the site. Now I block the word “script” completely, so evil people cannot alert() me every time I visit the blog.

Things that I will be working on:

  1. Keep user input in the session. Right now, when users hit “comment” or “post” while not logged in, they are directed to the login page and back, but their input is lost. I definitely don’t want them to type it all over again, so caching inputs is a good idea.
  2. Move log in/out to the main page as a small popup. Right now, when users click on login, they are directed to another page to enter their username and password. Keeping them on the same page will save them from losing context.
  3. Add styled text and images in posts. Maybe I could add some buttons so users can upload images for posts. I have to be careful, though, because some users, such as my friend, could upload bad things to my beloved server.

That’s pretty much it. I just finished my finals yesterday, and the good news is I got a perfect score on the algorithms final! Yayyy. This summer I plan to learn more about iOS and to build projects with PHP, Swift, and maybe a little JavaScript. My friend told me modern websites are mainly written with JavaScript, so I want to learn about that.

Python server with Flask

With the Flask and SQLAlchemy I talked about in the last post, I was able to implement a python server. It shows a list of restaurants on the main page. Clicking on a restaurant name shows its menu, and each menu item can be edited or deleted. Users can also create new menu items. The database uses SQLAlchemy, and URL resolution and HTTP request handling use Flask, a microframework that makes implementing web servers very easy.

The link to the project on GitHub is here.

With Flask, resolving a URL is simply @app.route('format'), with the handler method for that address declared under this line. Here is an example:

This means whenever the address fits the format host/restaurants/int, the method restaurantMenu will be executed. restaurant_id is a variable extracted from the URL and passed into the method as a parameter. Whatever this method returns shows up on the webpage. In this case, the method returns the result of render_template, a method in Flask that looks up the template named by its first parameter under /templates and passes in the necessary variables. menu.html loops over the items it receives and renders them.
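The code screenshot from the original post is missing; a reconstruction of what the handler might have looked like (render_template_string stands in for render_template so the sketch is self-contained, and the data is made up):

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Stand-in data; the real project reads these from the database.
MENUS = {1: ['Cheese Pizza', 'Caesar Salad']}

@app.route('/restaurants/<int:restaurant_id>/')
def restaurantMenu(restaurant_id):
    items = MENUS.get(restaurant_id, [])
    # The real project uses render_template('menu.html', items=items);
    # an inline template keeps this sketch runnable on its own.
    return render_template_string(
        '{% for item in items %}<p>{{ item }}</p>{% endfor %}', items=items)
```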

items is the variable passed into this template. Inside the HTML, Jinja statement blocks are used to loop over it.

url_for is another tool used in this project, also included in Flask. Its first parameter is the name of the handler method for the target URL, and the other parameters are the variables to pass.

The next step is for methods to handle both GET and POST HTTP methods. Here’s how to do it:

The methods need to be put in @app.route. Of course, the request object also needs to be imported from Flask. Handling multiple kinds of methods is virtually the same as handling one; it’s only one if statement away.
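That screenshot is missing too; a sketch of what a combined GET/POST handler might look like (the route and form field names are assumptions):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/restaurants/<int:restaurant_id>/new/', methods=['GET', 'POST'])
def newMenuItem(restaurant_id):
    if request.method == 'POST':
        # The real handler would insert the item into the database,
        # then redirect back to the menu page.
        name = request.form.get('name', '')
        return 'created %s' % name
    # On GET, show the creation form.
    return 'show the creation form'
```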

Finally, for the server to respond to requests with a JSON object, objects in the database need a function that returns all of their information.

In object class of the database setup python file:

And then use the jsonify function provided by Flask to return JSON:
return jsonify(MenuItem = menuItem.serialize)
The json can then be used by any kind of application!
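Putting the two pieces together, a self-contained sketch (a plain class stands in here for the SQLAlchemy model that defines serialize in the real project):

```python
from flask import Flask, jsonify

app = Flask(__name__)

class MenuItem(object):
    def __init__(self, name, price):
        self.name = name
        self.price = price

    @property
    def serialize(self):
        # Return the object's data in an easily serializable form.
        return {'name': self.name, 'price': self.price}

@app.route('/menu_item/json')
def menu_item_json():
    menuItem = MenuItem('Veggie Burger', '$7.50')
    return jsonify(MenuItem=menuItem.serialize)
```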

As a next step of this project, I plan to implement a server that works with iOS applications. I may need to write a framework for Swift to talk with my server. This means a lot of work, but it has great potential!

I’ll just forget about the fact that I have a midterm 30 minutes later and one more tomorrow.

Using SQLAlchemy in python to control databases

SQLAlchemy is an open-source python module that lets developers easily create and manipulate databases from python. The SQL language is not required because everything is already encapsulated inside the module. It also has a convenient “session” feature that works a bit like git – all changes can be staged and then committed together.

An introductory tutorial can be found here.

The idea is called Object-Relational Mapping (ORM). SQLAlchemy lets you use an object-oriented programming language such as python to operate on databases. For example, a Student table in a database can have several attributes: id, name, and school. id and name can be represented by primitive types, but school may be another object. In the actual database, there is a link from the student to the school that he or she belongs to. SQLAlchemy transforms this relationship into object references, which makes it easy to update tables with python code that can run on a server.
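The Student/School relationship described above could be sketched like this (class and column names are illustrative, not from the project; newer SQLAlchemy exposes declarative_base from sqlalchemy.orm):

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class School(Base):
    __tablename__ = 'school'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)

class Student(Base):
    __tablename__ = 'student'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # The foreign key in the table becomes an object reference in python:
    # student.school gives the School object, not just its id.
    school_id = Column(Integer, ForeignKey('school.id'))
    school = relationship(School)
```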

Here are some basic code segments that can be used:

#import and basic setup
import os
import sys
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy import create_engine

Base = declarative_base()

The above piece of code imports the necessary modules. Column, ForeignKey, etc. are not built-in python classes, so they must be imported. A declarative_base is a class that holds a database table, a mapper that matches the table to a python class, and a class object. The engine, created later, stores the data in the local directory.

class Restaurant(Base):
    __tablename__ = 'restaurant'

    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)


class MenuItem(Base):
    __tablename__ = 'menu_item'

    name = Column(String(80), nullable=False)
    id = Column(Integer, primary_key=True)
    description = Column(String(250))
    price = Column(String(8))
    course = Column(String(250))
    restaurant_id = Column(Integer, ForeignKey('restaurant.id'))
    restaurant = relationship(Restaurant)

As you can see, it is very easy to create python classes / database tables this way. “__tablename__” always needs to be provided. The other elements are declared like normal class attributes. String(n) means the string can have at most n characters. When nullable is False, the column can never be empty.

At the end, the engine needs to be created:

engine = create_engine('sqlite:///restaurantmenu.db')

Base.metadata.create_all(engine)

After this, a restaurantmenu.db file will be created in the local directory, fully functional. Hooray!
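The git-like session workflow mentioned at the top of this post isn’t shown above; here is a self-contained sketch against an in-memory database (declarative_base is imported from sqlalchemy.orm here, as newer SQLAlchemy versions prefer):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Restaurant(Base):
    __tablename__ = 'restaurant'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)

class MenuItem(Base):
    __tablename__ = 'menu_item'
    id = Column(Integer, primary_key=True)
    name = Column(String(80), nullable=False)
    price = Column(String(8))
    restaurant_id = Column(Integer, ForeignKey('restaurant.id'))
    restaurant = relationship(Restaurant)

engine = create_engine('sqlite://')  # in-memory database for this sketch
Base.metadata.create_all(engine)

session = sessionmaker(bind=engine)()

# Stage changes (like `git add`)...
burger_place = Restaurant(name='Urban Burger')
session.add(burger_place)
session.add(MenuItem(name='Veggie Burger', price='$7.50',
                     restaurant=burger_place))

# ...then commit them all together (like `git commit`).
session.commit()
```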

That’s pretty much it. I’ll update when I learn more. Code for this post is from Udacity.