Two Approaches to Building the Recamán Sequence in Tableau

A couple weeks ago, Andy Cotgreave tweeted the following:



The tweet linked to a YouTube video by Numberphile, in which mathematics writer Alex Bellos describes the Recamán Sequence. I had personally never heard of this number sequence, but after watching the video I became very intrigued, both with the sequence itself and its beauty when drawn on an axis. Andy challenged the community to see if anyone wanted to try to build it in Tableau. This kind of stuff excites me, so I couldn’t wait to get home from work that day and give it a go. And, in my excitement, I texted my brother about it. More on that shortly…

Ken’s Solution
By the time I got home from work, I had a pretty good idea how to approach it, so I quickly found a list of the first 70 numbers of the sequence and started to visualize it. After a bit of work, I had something close, though I could tell that something was wrong with the math.
I had some family stuff to do that day, so I had to take a break from it, but later that evening I returned to it, determined to fix the math problem. After some tinkering, I found the problem and produced my final visualization:



So, how does this work? First of all, I should state that my intention was to build all of the calculated fields, etc. in Tableau, with only minimal outside data prep. Luckily for me, I had been working on a process for building arc sankeys in Tableau (blog post coming soon) so I had already worked out much of the math required for drawing the arcs and would be able to repurpose a lot of it.

The Data
My data set included a few Excel tabs. As I’d be using parametric equations to draw the arcs, one of the tabs included the t parameter values (I could have used bins, of course, but I opted to push this to the data source). Another tab was used for the additional data densification required to create the TO and FROM records needed to draw the lines. And the final tab included each number in the sequence. All of this data was joined together in Tableau. Then I began creating the necessary calculated fields. I’m not going to go into detail about each of the calculated fields, but let me just address the key points:
  • Each number is plotted on a sort of number line. I started out making my number line go from left to right (I would eventually swap the axes so that it went from top to bottom).
  • An arc must be drawn from one number in the sequence to the next. As I’m drawing a semi-circle, radius is important for the calculations. This radius is calculated by first finding the distance between the two consecutive numbers, then dividing by 2. A similar calculation allowed me to find the center point between the two numbers.
  • X and Y coordinates are calculated based on the calculated radius and angle (as specified by the t values in the data source), plus adjustments based on the center point of each semi-circle. I also wanted the curves to alternate between going upward and downward, so a slight adjustment was needed for the Y coordinate. The final calcs are as follows:
X Arc
AVG([Radius])*COS(AVG([T Arc])) + [Center X]-1

Y Arc
IIF([Mod]%2=1, -1, 1)*[Radius]*SIN([T Arc])

Note: The Mod field essentially counts each line being drawn, so the sign of Y flips between odd-numbered and even-numbered lines, which makes the arcs alternate sides of the number line.
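For readers who want to check the arc math outside Tableau, here is a minimal Python sketch of the same idea. It is not the workbook's actual logic: the function name arc_points and the 21 evenly spaced t values are assumptions for illustration, the sign rule simply mirrors the Y Arc formula as written, and the "-1" offset from X Arc is left out because it appears to be specific to the original data set.

```python
import math

def arc_points(start, end, line_no, t_values):
    """Return (x, y) points on the semicircle joining two numbers on the number line."""
    radius = abs(end - start) / 2      # half the distance between the two numbers
    center_x = (start + end) / 2       # midpoint between the two numbers
    # Mirrors IIF([Mod]%2=1, -1, 1): odd-numbered lines get a negative sign.
    sign = -1 if line_no % 2 == 1 else 1
    return [
        (radius * math.cos(t) + center_x, sign * radius * math.sin(t))
        for t in t_values
    ]

# Assumed t values: 21 evenly spaced angles from 0 to pi (a half circle).
t_values = [math.pi * i / 20 for i in range(21)]

# Example: the arc for line 1, joining 1 and 3 on the number line.
points = arc_points(1, 3, line_no=1, t_values=t_values)
```

The first and last points land on the number line at the two numbers being joined, and the middle point sits one radius away from it.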

From here, I dragged X Arc to the columns shelf and Y Arc to the rows shelf, changed the mark type to line, and dragged T Arc to the path card, so that the lines would be drawn according to my t values (see my blog post on parametric equations for more details). As noted earlier, I also swapped my axes so it would go from top to bottom. Finally, I added the option to show it in color (using the hue circle color palette) or black. I also added an option to view it in circular, triangular, or octagonal form. For example, here’s the octagonal version (strictly speaking, octagonal is probably the wrong term since each arc has five points, but I used that term because the result looks like octagons):
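One plausible way to get these different shapes is simply to change how many t values are sampled along each half circle. The sketch below is only a guess at the idea, not the workbook's implementation: the five points for the octagonal form come from the description above, while three points for the triangular form, 50 for the circular form, and the names t_samples, semicircle, and SHAPE_POINTS are all assumptions.

```python
import math

# Points per arc for each shape. Only the "octagonal" count (five) is stated
# in the post; the other two values are assumptions for illustration.
SHAPE_POINTS = {"triangular": 3, "octagonal": 5, "circular": 50}

def t_samples(n_points):
    """Evenly spaced angles from 0 to pi, inclusive."""
    return [math.pi * i / (n_points - 1) for i in range(n_points)]

def semicircle(n_points, radius=1.0, center_x=0.0):
    """Points on a half circle; fewer points give a more angular outline."""
    return [
        (radius * math.cos(t) + center_x, radius * math.sin(t))
        for t in t_samples(n_points)
    ]

outlines = {shape: semicircle(n) for shape, n in SHAPE_POINTS.items()}
```

With three points the line mark connects start, apex, and end, which gives a triangle; with five it traces half of an octagon.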



As I noted earlier, I had texted Kevin about Andy’s challenge before I began working on my solution. He was intrigued by it as well and made his own attempt at creating it. Below is an explanation of how he did it, in his words.

Kevin’s Solution
I received a text from Ken with a link to Andy's Recaman’s Sequence challenge. The precious words of my brother read: "I can't stop thinking about this". What a nerd! Truth is, within 30 seconds, I was on my computer trying to recreate it myself. (Yeah, I'm a nerd too). My ability to perform calculations in Tableau is still somewhat limited (I’m barely three months into my journey and believe me, I’m working on the calculations piece), but I was certain that I could replicate the math in Excel then drop it into Tableau.

Excel was utilized to calculate the coordinates of each move and also to create an additional point between each move to help the lines be drawn. The keys are knowing when to move forward or backward, knowing when a number has already been used, and knowing whether to draw the line upward or downward. To really explain what was done would take some significant time, so I am not going to go nuts on the details. I will just generally explain the intention of each column and show the corresponding formulas and resulting data (with a subset of the data).
  • Line Number:  This was to label each point.
  • Start Point: This dictates the starting point on the number line for each line.
  • Move Length:  The sequence says to increase the length by one for each move. This represents the length of the move.
  • Radius:  When you think about it, the move length is equal to the diameter of the circle and the radius is half that length.
  • Initial Result: The sequence says that if you can move left, you will move left. This makes that determination.
  • Already Used: The sequence says that you cannot use a number twice. This determines whether the number has already been used.
  • Final Result:  This incorporates both the previous two items to determine the next point on the number line.
  • Up or Down: To draw the sequence, the first move will be up, the second will be down, then up, then down, etc.
  • Left or Right: Based on the Initial Result column, this tells you if the move will be made to the left or the right on the number line.
  • X: X coordinate for your point.
  • Y: Y coordinate for your point. (Most of these will simply be 0, which would be on the number line)
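The logic behind the Move Length, Initial Result, Already Used, and Final Result columns can be sketched in a few lines of Python. This is a rough equivalent rather than a translation of the spreadsheet formulas: the function name recaman is made up for illustration, and starting the sequence at 0 is an assumption based on the usual definition of the sequence.

```python
def recaman(n_terms):
    """Return the first n_terms numbers of the Recamán sequence."""
    seq = [0]       # assumed start point, per the usual definition
    used = {0}      # numbers already visited on the number line
    for move_length in range(1, n_terms):
        # Initial Result: try to move left (backward) by the move length.
        initial_result = seq[-1] - move_length
        # Already Used: the backward move is only allowed if the target is
        # positive and has not been visited before.
        if initial_result > 0 and initial_result not in used:
            final_result = initial_result
        else:
            # Otherwise move right (forward) instead.
            final_result = seq[-1] + move_length
        seq.append(final_result)
        used.add(final_result)
    return seq

first_ten = recaman(10)
print(first_ten)  # [0, 1, 3, 6, 2, 7, 13, 20, 12, 21]
```

Keeping the visited numbers in a set makes the Already Used check a single lookup, whereas the spreadsheet has to compare against every earlier start point.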

Note that there are several lines where everything but the X & Y coordinates are blank. These are points simply to guide the drawing of the line. To do this, I averaged the X coordinates of the line above and the line below so that the “drawing point” was exactly between them. The Y coordinate would be based on the Radius as well as if the move was dictated to be up or down. The result was a number equal to the radius that was positive for UP and negative for DOWN.
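As a rough illustration of those guide points, the Python sketch below interleaves one "drawing point" between each pair of consecutive numbers. The function name triangular_path and the sample input are assumptions; the rule that the first move goes up follows the Up or Down column described above.

```python
def triangular_path(seq):
    """Return (x, y) points: number-line points with one guide point between each pair."""
    points = [(float(seq[0]), 0.0)]
    for move, (a, b) in enumerate(zip(seq, seq[1:]), start=1):
        radius = abs(b - a) / 2
        direction = 1 if move % 2 == 1 else -1   # first move up, second down, ...
        # Guide point: X is the average of the two neighbors, Y is +/- the radius.
        points.append(((a + b) / 2, direction * radius))
        # The point on the number line itself always has Y = 0.
        points.append((float(b), 0.0))
    return points

# Sample input: the first four numbers of the sequence.
path = triangular_path([0, 1, 3, 6])
print(path)
# [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0), (2.0, -1.0), (3.0, 0.0), (4.5, 1.5), (6.0, 0.0)]
```

Connecting these points in order with straight lines produces the triangular version of the drawing.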

Formulas


Results

The result of this is a list of points each with X and Y coordinates. The even line numbers show the points on the number line. The odd line numbers show the points that will be used to help draw the lines. Below is a snapshot of what the scatter plot looks like without any lines. All points not located on the number line (Y coordinate of 0) are the points used to draw the lines.



From there, I connected to the data within Tableau. I dropped the X coordinate on the columns, Y coordinate on the rows, and placed Line Number on the detail card to create a scatter plot. I changed it to a line chart then dragged the Line Number dimension onto the path card to connect the points in that exact sequence. The result is a triangular plot of Recaman’s Sequence (which one person said looked like a wiring diagram—I'd hate to be an electrician with that job).

Below is a picture of the visualization (click on it to go to the actual viz). Please note that with some additional work and additional lines, I could calculate several additional points to draw curves rather than straight lines. This would be quite a bit more work to do in Excel.


Wrap-Up
Okay, it’s Ken again. I love that our approaches were so different. That is mostly due to the fact that I have a bit more Tableau experience than he does and he’s an Excel whiz, but I think it does demonstrate a real-world dilemma that we face regularly: do we prep the data up front or try to use the visualization tool to do all the work? Most of the time, I think that the answer is, in the words of Star Lord…


Of course, the precise mix depends on many things—the use case, tools available, your skill set, etc.

One key similarity in our approaches is that we both looked at this sequence and immediately said “Yes, I can do that.” And I think that’s the key point of all this. Tableau is such an amazing tool. You can quickly and easily create beautiful and insightful charts and dashboards. But the Tableau platform also allows you to do so much more—it's a data-driven drawing tool. What you can do with it is almost limitless!! If you can imagine it, you can probably visualize it.

Ken & Kevin Flerlage, June 23, 2018

4 comments:

  1. Excellent work, Ken & Kevin. I have seen the Tableau Public profiles of both of you, and both have fabulous vizzes. Pertaining to this viz, I have a question regarding the T Arc field in Ken's solution and the Already Used field in Kevin's. How did you guys come up with these?

    Replies
    1. T Arc was part of my data set. It is a set of T values needed to draw the semi-circles using parametric equations. That was a mouthful...if you don't follow me, I'd suggest reading my blog post on parametric equations: http://www.kenflerlage.com/2017/11/beyond-show-me-part-3-parametric.html. I'll let Kevin comment on the "Already Used" field in his.

    2. [Kevin Flerlage] Thanks for the kind words. Ken is a whiz in Tableau and focused his efforts within that tool. I believe he utilized a list of already generated points on the number line for his work, then used T values to draw the circles. I am not yet there in Tableau, but I can do most of the work in Excel. I will be honest and tell you that I had not even considered obtaining a list of points on the number line, so in my mind, I had to calculate them myself. In fact, most of the work in this spreadsheet was simply to determine the list of points to be used. (Each row that contains a start point is one that determines a point on the number line. Each row without a start point is simply used to draw the line.) Recaman's Sequence says the move increases by one each time, you must move backward if you can (if not, you would move forward), and you cannot use the same number twice. The "Already Used" field is used to determine if the newly generated point on the number line (Initial Result, which was a move backward per the sequence) had been used before (Start Point). If it had not been used previously, then it would become the "Final Result". If it was used previously, then the move would go forward instead of backward and a new Final Result would be calculated. I'd be happy to share the spreadsheet with you and walk through it with you if you like. Feel free to contact me.

    3. You can contact me @FlerlageKev on Twitter or flerlagek@gmail.com.

