3
\$\begingroup\$

I think the only big improvement that can be made is to check which shard to query based on the userIds of the followed users. One easy way is to check the last number of each userId (0,1,2,3,4,5,6,7,8,9) and then find which shard I need to query if I have 10.

Besides that is there anything wrong?

I am wondering if it makes sense to call the shards in a for loop, or there's a way to call it once and let it get handled by a library.

const express = require('express');
const mysql = require('mysql2/promise');

const app = express();
const port = 3000;

// Replace with your actual database connection details
const shardConfigs = [
  {
    host: 'shard1_host',
    user: 'shard1_user',
    password: 'shard1_password',
    database: 'shard1_database',
  },
  {
    host: 'shard2_host',
    user: 'shard2_user',
    password: 'shard2_password',
    database: 'shard2_database',
  },
  // Add configurations for other shards as needed
];

// Create connections for each shard
const shardConnections = shardConfigs.map(config => mysql.createConnection(config));

app.get('/user/feed/:userId', async (req, res) => {
  try {
    const userId = req.params.userId;

    // Step 1: Identify Sharded Database Shard (Hash-based Strategy)
    const selectedShardIndex = userId % shardConfigs.length;
    const selectedShardConnection = shardConnections[selectedShardIndex];

    // Step 2: Query User's Followings
    const followingQuery = 'SELECT followed_user_id FROM followers WHERE follower_user_id = ?';
    const [followingRows] = await selectedShardConnection.execute(followingQuery, [userId]);
    const followingUsers = followingRows.map(row => row.followed_user_id);

    // Step 3: Retrieve Tweets from Followed Users
    const tweets = [];
    for (const shardConnection of shardConnections) {
      const tweetsQuery = 'SELECT * FROM tweets WHERE user_id IN (?) ORDER BY timestamp DESC';
      const [shardTweets] = await shardConnection.execute(tweetsQuery, [followingUsers]);
      tweets.push(...shardTweets);
    }

    // Step 4: Sort and Combine Tweets
    const combinedTweets = tweets.sort((a, b) => b.timestamp - a.timestamp);

    // Step 5: Limit and Paginate
    // Add logic here to limit the number of tweets returned, implement pagination if needed

    // Step 6: Return the Feed to the User Interface
    const feedData = combinedTweets.map(tweet => ({
      tweet_id: tweet.tweet_id,
      user_id: tweet.user_id,
      content: tweet.content,
    }));

    res.json({ feed: feedData });
  } catch (error) {
    console.error(error);
    res.status(500).json({ error: 'Internal Server Error' });
  }
});

app.listen(port, () => {
  console.log(`Server is running at http://localhost:${port}`);
});
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

You're using mysql.createConnection(config) inside a map function which is synchronous while mysql.createConnection is asynchronous and returns a Promise thus you might end up with a Promise object instead of a connection object in shardConnections. You should await these connections inside an async function before starting your server.

async function initShardConnections() {
  for (let i = 0; i < shardConfigs.length; i++) {
    shardConnections[i] = await mysql.createConnection(shardConfigs[i]);
  }
}

// Call this function before starting your server
initShardConnections().then(() => {
  app.listen(port, () => {
    console.log(`Server is running at http://localhost:${port}`);
  });
});

Querying each shard in a loop is not the best way, as it does not utilize the potential for parallel queries. A better approach is to initiate all shard queries simultaneously using Promise.all and then combine the results thus your queries will run in parallel, reducing the overall response time.

const tweetPromises = shardConnections.map(shardConnection =>
  shardConnection.execute(tweetsQuery, [followingUsers])
);
const results = await Promise.all(tweetPromises);
const tweets = results.flatMap(result => result[0]);
\$\endgroup\$