Noah Fields
13,985 Points

Why divide by 60 and then 24 instead of 1440?

Dividing a number by 60 and then by 24 is mathematically equivalent to dividing that same number by a single value: 60 times 24, or 1440. Not only is this exactly equal and easily found with a calculator or a bit of pencil-and-paper work, it also reduces the number of calculations the system has to do for each section by one. This is, to my mind, objectively superior to dividing twice as shown in the video, unless perhaps 60, 24, or both were variables (and they aren't). So, for what reasons would you, or I in the future, choose to do the calculation the way it is shown in the video rather than my way?
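For anyone who wants to check the equivalence, here is a minimal sketch. It uses exact rational arithmetic from Python's `fractions` module, so no rounding is involved. The sample values and the names `two_step` and `one_step` are made up for illustration. With ordinary floats the two forms could in principle differ in the last bit, so the float comparison uses `math.isclose` rather than `==`.

```python
from fractions import Fraction
import math

MINUTES = [1, 90, 1440, 10000, 123457]  # arbitrary sample values (assumption)


def two_step(x):
    """Divide by 60 (minutes -> hours), then by 24 (hours -> days)."""
    return x / 60 / 24


def one_step(x):
    """Divide once by 60 * 24 = 1440 (minutes -> days)."""
    return x / 1440


# Exact rational arithmetic: the two forms give identical results.
exact_equal = all(two_step(Fraction(m)) == one_step(Fraction(m)) for m in MINUTES)

# Floating point: compare with a tolerance instead of ==.
float_close = all(math.isclose(two_step(m), one_step(m)) for m in MINUTES)

print(exact_equal, float_close)
```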

Dave StSomeWhere
19,870 Points

Maybe it is just easier for someone else (or you at a later date) to understand the purpose of the calculation.

9 Answers

Steven Parker
231,269 Points

If you're converting minutes into days, dividing separately by 60 and then 24 might make your intentions more clear to someone reading the formula.

Also, it's quite likely that internal optimizations will combine the values and actually do the math as you're suggesting for efficiency no matter how it is written.

Writing out the formulas can help. To convert minutes to days: 1 hour = 60 minutes and 1 day = 24 hours.

To get days, divide x by 60 to get hours, then divide that result by 24: (x / 60) / 24.

It’s didactic. He’s instructing us in a way that is easy to understand.

Steven Parker
231,269 Points

Being easy to understand is also a good coding practice. :wink:

Can someone confirm why we are converting minutes into days here?

Dave McFarland
Treehouse Teacher

mohammed ismail -- you have to convert to days because that's the unit Google Sheets uses when doing time comparisons.
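To illustrate the idea, here is a small sketch in which a duration is stored as a fraction of one day, so 90 minutes becomes 90 / 1440. It uses Python's `datetime.timedelta` as a stand-in for the spreadsheet, and the function name `minutes_to_day_fraction` is made up for this example.

```python
from datetime import timedelta

MINUTES_PER_DAY = 60 * 24  # 1440


def minutes_to_day_fraction(minutes):
    """Return a duration given in minutes as a fraction of one day."""
    return minutes / MINUTES_PER_DAY


fraction = minutes_to_day_fraction(90)
as_duration = timedelta(days=fraction)  # the same duration as a timedelta

print(fraction)
print(as_duration)
```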

That’s true. However, if it were in a (really) frequently used loop, I would use 1440 to minimize ops. You could make it clear in comments what you are doing.

Steven Parker
231,269 Points

As I said in my original answer, the language itself will most likely do that for you (optimizing the value, not writing a comment!). You could run a benchmark test to be certain.

Hambone F
3,581 Points

This thread is ancient, but there's an easy middle ground for this.

Define a constant equal to the divisor before the actual operation, i.e.:

MINUTES_PER_DAY = 60 * 24

Then inside your loop:

value_in_days = value_in_minutes / MINUTES_PER_DAY

I suppose it depends a bit on the language, but in general that should be just about as fast as the single literal division, and still retain the clarity of the code itself.
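Put together, the idea might look like the sketch below. The function name `to_days` and the sample data are made up for illustration; the point is only that `60 * 24` is evaluated once, while the loop body does a single division per value.

```python
MINUTES_PER_DAY = 60 * 24  # evaluated once, when the module is loaded


def to_days(values_in_minutes):
    """Convert each duration from minutes to days using the named constant."""
    return [value / MINUTES_PER_DAY for value in values_in_minutes]


sample_minutes = [720, 1440, 4320]  # made-up sample data
print(to_days(sample_minutes))
```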

I ran a benchmark test. I compared one division and two divisions with 100,000,000 loops. The one division case ran in 4.34279s. The two division case ran in 4.43506s. This is a 2% improvement. Google would want to save that 2%, as it saves them from buying thousands more computers for their search engine.

Steven Parker
231,269 Points

But were you dividing two literal values in the benchmark?

This is my quick code. I used int literals. Let me know if I should do something differently. In terms of operation counts, the first case costs roughly p + 2n and the faster case roughly p + n, where n is the cost of the divisions and p covers everything else, such as incrementing and checking the loop counter (which should be the same in both). Strictly speaking, big O notation drops constant factors, so both cases are the same order; the difference is only in the constant. The faster case has fewer operations. It could be that the compiler optimizes a number of things, but I am seeing a difference here. I haven't tested statistical significance.

import time

start1 = time.time()

N = 100000000

rng = range(N)

for k in rng:
    10000/24/60

stop1 = time.time()

duration1 = stop1 - start1
time_per_iter1 = duration1/N

print(duration1)

start2 = time.time()

for k in rng:
    10000/1440

stop2 = time.time()

duration2 = stop2 - start2
time_per_iter2 = duration2/N

print(duration2)

Steven Parker
231,269 Points

First, move the assignment of "start1" after the assignment of "rng" so that the two code segments have the same number of steps.

Then, run the test several times to account for variances in system performance. I tried it myself and found more than 10% variation between successive runs. And on some runs, the first time was shorter than the second time.
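One way to repeat the measurement is the standard library's `timeit` module. This is only a sketch: the helper name `best_time`, the repeat counts, and the choice to report the fastest run are assumptions, not the test I actually ran. The setup string assigns `x` so that the division operates on a variable.

```python
import timeit


def best_time(stmt, setup="x = 10000", repeat=5, number=100_000):
    """Return the fastest per-iteration time in seconds over several repeats."""
    runs = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the system.
    return min(runs) / number


two_divs = best_time("x / 60 / 24")
one_div = best_time("x / 1440")

print(f"two divides: {two_divs:.3e} s per iteration")
print(f"one divide:  {one_div:.3e} s per iteration")
```

The numbers will vary from machine to machine and from run to run, so it is worth running it a few times before drawing conclusions.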

I also ran the test using variables instead of constants. The times were both significantly longer, and the variance between them was consistently over 40%.

My conclusion is that literal math is done prior to run time (as expected), but variable math is done as the statement is executed.
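A rough way to look at this directly, assuming CPython, is to compile each expression and count the arithmetic instructions left in the bytecode. Instruction names differ between Python versions, so matching on the `BINARY_` prefix is a heuristic, and the helper name `count_binary_ops` is made up for this sketch.

```python
import dis


def count_binary_ops(expression):
    """Count bytecode instructions named BINARY_* in a compiled expression."""
    code = compile(expression, "<expr>", "eval")
    return sum(
        1 for ins in dis.get_instructions(code) if ins.opname.startswith("BINARY_")
    )


for expr in ("10000 / 60 / 24", "x / 60 / 24", "x / 1440"):
    print(f"{expr!r}: {count_binary_ops(expr)} binary operation(s) left at run time")
```

If the all-literal expression reports fewer operations than the version with `x`, the literals were combined before run time. With a variable on the left, the expression is evaluated as `(x / 60) / 24`, so both divisions remain.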

I modified the code and found that one division was consistently faster than two divisions on my computer with nothing else running (the test reported a p-value of 1.0). Does anything need to be modified? If not, please try it on your computer so that we can generalize slightly.

import time
from scipy import stats
import numpy as np
"""
This script compares the timing for one division and two divisions with int literals as divisors.
Int literals may be handled by the interpreter ahead of time, in which case the timing may be
the same or very similar.  According to the python 3.7 test results of this computer, the one
division case is faster than the two division case.  compute_p_value reports 1.0, meaning the
test finds no evidence that one division is slower.  This
provides some evidence that python handles the two divisions separately.

$ python optim_test.py
P-VALUE:
1.0

ONE DIV:

Mean
7.01418685913086e-08
Standard Dev
8.734919694020465e-10

TWO DIVS:

Mean
9.183040618896484e-08
Standard Dev
8.187212683963087e-10
"""

def compute_mean(time_list):
    "Compute the mean with NumPy"
    return np.array(time_list).mean()


def compute_std_dev(time_list):
    "Compute the sample standard deviation with NumPy"
    return np.array(time_list).std(ddof=1)


def compute_p_value(a, b):
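    """
    Pooled two-sample t-test for equal-sized samples a and b.
    Returns the upper-tail probability P(T > t), so a value near 1
    means mean(a) is well below mean(b).

    Towards Data Science
    https://towardsdatascience.com/inferential-statistics-series-t-test-using-numpy-2718f8f9bf2f
    """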
    M = len(a)
    a = np.array(a)
    b = np.array(b)
    var_a = a.var(ddof=1)
    var_b = b.var(ddof=1)
    s = np.sqrt((var_a + var_b) / 2)
    t = (a.mean() - b.mean()) / (s * np.sqrt(2 / M))
    df = 2 * M - 2
    return 1 - stats.t.cdf(t, df=df)


x = 10000
M = 25
N = 1000000

rngm = range(M)
rngn = range(N)

time_per_iter_2divs = []
time_per_iter_1div = []

for _ in rngm:

    start = time.time()
    for _ in rngn:
        x / 24 / 60
    stop = time.time()

    time_per_iter_2divs.append((stop - start) / N)

    start = time.time()
    for _ in rngn:
        x / 1440
    stop = time.time()

    time_per_iter_1div.append((stop - start) / N)

print("P-VALUE:")
print(compute_p_value(time_per_iter_1div, time_per_iter_2divs))

print("\nONE DIV:\n")

print("Mean")
print(compute_mean(time_per_iter_1div))
# print(math.mean(time_per_iter_1div))

print("Standard Dev")
print(compute_std_dev(time_per_iter_1div))

print("\nTWO DIVS:\n")

print("Mean")
print(compute_mean(time_per_iter_2divs))
# print(math.mean(time_per_iter_2divs))

print("Standard Dev")
print(compute_std_dev(time_per_iter_2divs))

Steven Parker
231,269 Points

I had trouble installing scipy. Well, it appeared to install OK, but when I ran the program I got errors (from scipy.special._ufuncs). However, here are the per-iteration times from two consecutive runs of the previous program:

Attempt      Two divides              One divide
First run    6.893455743789673e-08    6.89247965812683e-08
Second run   6.30212664604187e-08     6.354840755462647e-08

I don't think the two divides are actually faster, but I do think the optimization is actually converting it into a single divide. If you get different results, it's certainly possible that your Python version doesn't perform the same optimizations. I first encountered this kind of optimization in a different language, and it was dependent on the compiler there.

I guess the bottom line is if you're not sure your system optimizes literal math, and your program will be doing rapid calculations, your approach of hand-combining the literals (with appropriate comments) is probably a good idea.

But if you know your optimization handles it, or if the program will only use the calculation infrequently, the clarity of showing the complete calculation might make it the best choice.

I reposted the code after changing the compute_p_value function. It may run for you now. It's interesting that it doesn't appear to be generalizing. I didn't find a case where the first run was slower than the second run like you did. It sounds like a fine summary. I wonder if the instructor Kenneth Love would know more.
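In case scipy still will not load, here is a minimal sketch that uses only the standard library. It computes the mean, the sample standard deviation, and the same pooled t statistic as compute_p_value, but stops short of a p-value, since that needs the t distribution's CDF. The function names and the timings in the example are made up for illustration; they are not measurements.

```python
import math
import statistics


def summarize(times):
    """Return (mean, sample standard deviation) for a list of timings."""
    return statistics.mean(times), statistics.stdev(times)


def pooled_t_statistic(a, b):
    """Equal-variance two-sample t statistic for equal-sized samples a and b."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    m = len(a)
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / (pooled_sd * math.sqrt(2 / m))


# Made-up timings, for illustration only (not measurements).
one_div = [7.0e-08, 7.1e-08, 6.9e-08, 7.0e-08]
two_divs = [9.1e-08, 9.2e-08, 9.3e-08, 9.2e-08]

print(summarize(one_div))
print(summarize(two_divs))
print(pooled_t_statistic(one_div, two_divs))
```

A strongly negative t statistic means the first sample's mean is well below the second's.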

Steven Parker
231,269 Points

Unfortunately for us, Kenneth has moved on to other opportunities. But perhaps one of the current instructors may comment.