INFO250 - Programming Languages

Lab Work: Introduction to Programming at the command line:

(9/7) Noobs use vi for the editor; ignore the other options for the time being. At the top of PHP scripts, after the #! line, the escape into PHP must be '<?php', not just '<?'.
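Here is a minimal sketch of what the top of such a script might look like. It assumes the PHP command-line interpreter lives at /usr/bin/php (check with 'which php'), and the function name greeting() is made up for illustration:

```php
#!/usr/bin/php
<?php
// The full '<?php' tag works regardless of the short_open_tag setting;
// the bare '<?' form only works when short_open_tag is enabled.

// Hypothetical helper, named here only for illustration.
function greeting(string $name): string
{
    return "Hello, $name, from the command line\n";
}

echo greeting('INFO250');
```

Saved as, say, hello.php and made executable with chmod +x, it should run as ./hello.php.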

(9/26) Stats on Random and Gaussian Distributions

Warnier Diagrams for BubbleSort Demo'd and Discussed in Class...

[Diagrams: Bubble Sort Page 1, Bubble Sort Page 2, Bubble Sort Page 3]

We do not anticipate posting the code to copy, but will be pleased to provide glimpses and advice...

(9/26) Project #2 - Sorting, Min, Max, Mean, Median

Write an app for your command line at info250.us that will read data from files at /home/gschool/*.nbrs, sort them, and display the time it takes to sort the array and descriptive stats: min, max, mean, and median.

(9/30) How to calc Min and Max: Min and Max need to be set to an initial value that comes from the dataset being examined. As each $ARec is read, those that are numeric get counted and inserted into an array. The best place in the script to determine Min/Max is just after $N++. If $ARec is greater than the current value of $Max, it replaces the current $Max. If $ARec is smaller than the current $Min, it replaces the current $Min.

The first suggestion here is a quick fix to previous sample code and sketches. It's not very efficient since it has to check the value of $N as $ARec is inserted into an array to set the starting values for $Min and $Max, but it works. Check out 'a better way' just below for ideas for Project #3...

Code for Calc Max/Min

Here is structured pseudocode for finding Min and Max values in a dataset peppered with invalid records. It shows structure with indentation, and has a heavy PHP accent:

open AFile
read ARec
If !ARec
  echo Nothing there
  exit
Recs ← 0
Bad ← 0
While ARec !numeric
  Bad ++
  Recs ++
  Read ARec
  If !ARec
    echo Bad data followed prior Bad data
    exit
N ← 0; Sum ← 0; Min ← ARec; Max ← ARec
Do
  Recs ++
  If ARec is numeric
    N++
    Numbers[N] ← ARec
    Sum += ARec
    if ARec > Max then Max ← ARec
    if ARec < Min then Min ← ARec
  Else
    Bad ++
While read ARec
Mean ← Sum/N
echo N, Min, Max, Sum, Mean, Bad
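For anyone who wants to see the shape of the pseudocode above in PHP, here is a rough sketch, not the sample script itself. It assumes the records are already in an array of strings rather than being read one at a time from AFile, leaves out the Recs counter, and the function name summarize() is made up for illustration:

```php
<?php
// Sketch of the pseudocode above. summarize() is a hypothetical name;
// $records stands in for the lines read from AFile.
function summarize(array $records): ?array
{
    $count = count($records);
    if ($count === 0) {
        return null;                       // "Nothing there"
    }

    // Spin past bad data at the start of the dataset.
    $i = 0;
    $bad = 0;
    while ($i < $count && !is_numeric(trim($records[$i]))) {
        $bad++;
        $i++;
    }
    if ($i === $count) {
        return null;                       // nothing but bad data
    }

    // The first numeric record seeds Min and Max.
    $first = trim($records[$i]) + 0;
    $n = 0;
    $sum = 0;
    $min = $first;
    $max = $first;
    $numbers = [];

    do {
        $rec = trim($records[$i]);
        if (is_numeric($rec)) {
            $value = $rec + 0;             // numeric string -> int or float
            $n++;
            $numbers[$n] = $value;         // 1-based, as in the sample script
            $sum += $value;
            if ($value > $max) $max = $value;
            if ($value < $min) $min = $value;
        } else {
            $bad++;
        }
        $i++;
    } while ($i < $count);

    return ['N' => $n, 'Min' => $min, 'Max' => $max, 'Sum' => $sum,
            'Mean' => $sum / $n, 'Bad' => $bad, 'Numbers' => $numbers];
}

print_r(summarize(['x', '7', '3', 'oops', '10']));
```

To try it on a real file, something like file($path, FILE_IGNORE_NEW_LINES) would supply the array of records.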

To calculate the time, place something like $StartTime = microtime(true); just before the sort starts, after loading the array. Just after the sort, before displaying the sorted array, place $FinishTime = microtime(true); When reporting the stats, calc the time using something like this: $SortTime = $FinishTime - $StartTime; BTW, microtime(false) is the default and returns a string in the form 'msec sec' instead of a float, so subtracting two of those values gives spurious results.
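A small sketch of the timing described above. The data is made up, and PHP's built-in sort() is only a stand-in for wherever your own sort goes:

```php
<?php
// Made-up data standing in for the array loaded from a *.nbrs file.
$Numbers = [];
for ($i = 0; $i < 10000; $i++) {
    $Numbers[] = mt_rand(1, 1000);
}

$StartTime = microtime(true);    // float: seconds since the Unix epoch
sort($Numbers);                  // stand-in for your own sort
$FinishTime = microtime(true);

$SortTime = $FinishTime - $StartTime;
printf("Sorted %d numbers in %.6f seconds\n", count($Numbers), $SortTime);
```

Keeping the load and the display outside the two microtime() calls means only the sort itself gets timed.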

(10/3) Here's the missing piece for calculating the Median from a sorted array. If there is an even number of numbers the median is the average of the two numbers at the midpoint of the array. If there is an odd number, the median is the number at the midpoint. A modular division by two will be zero for an even number. Something like this works for a 1-based array as used in the sample script, substituting your variable names:

$Mid = intval($N /2) + 1;
if ($N % 2 == 0) {
  $Median = ($Numbers[$Mid-1] + $Numbers[$Mid]) / 2;
} else {
  $Median = $Numbers[$Mid];
}
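The same logic can be wrapped in a function and checked against small arrays where the answer is obvious. This is only a sketch: the name median() and the test data are made up, and it assumes the array is already sorted and 1-based:

```php
<?php
// median() is a hypothetical helper; $Numbers must be sorted and 1-based.
function median(array $Numbers, int $N): float
{
    $Mid = intval($N / 2) + 1;
    if ($N % 2 == 0) {
        // Even count: average the two numbers at the midpoint.
        return ($Numbers[$Mid - 1] + $Numbers[$Mid]) / 2;
    }
    // Odd count: the number at the midpoint.
    return (float) $Numbers[$Mid];
}

$Odd  = [1 => 2, 2 => 5, 3 => 9];           // midpoint is element 2
$Even = [1 => 2, 2 => 5, 3 => 9, 4 => 10];  // average of elements 2 and 3

echo median($Odd, 3), "\n";   // prints 5
echo median($Even, 4), "\n";  // prints 7
```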

(10/10) Project #3 - Standard Deviation, Histogram, and Mode

Central Tendency

Sample runs on selected *.nbrs files ← Check your stats for Eleven.nbrs and Twelve.nbrs against these and make them match...

Structured Notation and Code Snippets

[Images: Central Tendency Structured Notation; Central Tendency code snippets (7 images)]

Calculating the Standard Deviation for a sample or population requires two passes of the data. The 1st pass is to count and find min, max, and mean. The 2nd pass is to calculate each score's deviation from the mean and accumulate the sum of squared deviations.
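A sketch of the two passes, under a few assumptions: the function name stdDev() is made up, min and max are left out to keep it short, and the $sample flag picks between dividing by N - 1 (sample) and N (population):

```php
<?php
// stdDev() is a hypothetical helper. $sample = true divides by N - 1
// (sample); $sample = false divides by N (population).
function stdDev(array $numbers, bool $sample = true): float
{
    // Pass 1: count the scores and find the mean.
    $n = 0;
    $sum = 0;
    foreach ($numbers as $x) {
        $n++;
        $sum += $x;
    }
    if ($n < ($sample ? 2 : 1)) {
        throw new InvalidArgumentException('Not enough numbers');
    }
    $mean = $sum / $n;

    // Pass 2: accumulate the sum of squared deviations from the mean.
    $sumSq = 0;
    foreach ($numbers as $x) {
        $sumSq += ($x - $mean) ** 2;
    }

    return sqrt($sumSq / ($sample ? $n - 1 : $n));
}

$scores = [2, 4, 4, 4, 5, 5, 7, 9];       // mean is 5
echo stdDev($scores, false), "\n";        // population SD, prints 2
```

The mean has to be known before any deviation can be calculated, which is why one pass isn't enough with this approach.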

Calculating the Mode is usually done on data that has been grouped within a convenient number of 'bins' containing equal ranges of the scores, visualized as a histogram. (The 'bar chart' is used when there are discrete responses, like household pets: dogs, cats, ferrets, reptiles, rodents, fish.) The mode is the midpoint of the bin with the most scores. If two bins share the highest count, the distribution is bi-modal. If several bins have high counts, that's a clue that the trait represented by the numbers is not normally distributed.
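Here is one way the binning might be sketched. The function names histogram() and modes() are made up, the data is invented, and it assumes a score that falls exactly on a bin boundary goes into the upper bin:

```php
<?php
// histogram() is a hypothetical helper. It splits the range Min..Max into
// $bins equal-width bins and returns two 1-based arrays:
// $BinMid (midpoint of each bin) and $BinN (count of scores in each bin).
function histogram(array $numbers, int $bins): array
{
    $min = min($numbers);
    $max = max($numbers);
    $width = ($max - $min) / $bins;

    $BinMid = [];
    $BinN = [];
    for ($b = 1; $b <= $bins; $b++) {
        $BinMid[$b] = $min + ($b - 0.5) * $width;
        $BinN[$b] = 0;
    }

    foreach ($numbers as $x) {
        // The top score would land in bin $bins + 1, so clamp it.
        $b = $width > 0
            ? min($bins, (int) floor(($x - $min) / $width) + 1)
            : 1;
        $BinN[$b]++;
    }
    return [$BinMid, $BinN];
}

// Midpoints of the bin(s) with the highest count; two entries = bi-modal.
function modes(array $BinMid, array $BinN): array
{
    $top = max($BinN);
    $result = [];
    foreach ($BinN as $b => $count) {
        if ($count === $top) {
            $result[] = $BinMid[$b];
        }
    }
    return $result;
}

[$BinMid, $BinN] = histogram([0, 1, 3, 4, 4, 5, 5, 5, 7, 10], 5);
print_r($BinN);                    // counts per bin: 2, 1, 5, 1, 1
print_r(modes($BinMid, $BinN));    // one mode, at the midpoint 5
```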

'Deciles' and 'Quartiles' are common groupings of ranked scores: deciles split the sorted scores into ten groups of equal count, quartiles into four. Quartiles can be used to plot the popular 'box and whisker' graphs, which show more about central tendency than a plain histogram.

Histograms with enough bins reveal whether the distribution is random, even, or gaussian. Histograms with too many bins, or plotted on individual scores, may be comb shaped and obfuscate the curvature of the data rather than demonstrate it.

Do/While Loop: In programming, it's often necessary to do something extra with the first record in a file or dataset, then process the first and subsequent records using a do/while loop. For this example, the score in the first record is used to load the starting values for Min and Max calculation, then load the array Numbers with a do/while loop.

PHP's while, do/while, for, and foreach loops are employed to spin past bad data at the beginning of a file, load an array, sort it, calculate properties for bins, and format reports.

A side-effect of do/while is that it removes the several cpu-cycles wasted in the sample code for Project #2, where $N == 1 is tested for _every_ record read -- using the do/while loop gets about 20% improvement in cpu time...

Two-Dimensioned Arrays - Plotting a Vertical Histogram

Here are some Examples of Character Plots and the script that makes them. A vertical histogram is the last example. It uses arrays $BinMid[] and $BinN[] similar to prior projects.

Character plots on a small 'canvas' are rough when compared to plots in a graphics tool where each pixel can be calculated. The 'little box' in which a character from a fixed-width font is rendered isn't square. Most characters for a command-line interface are rendered in a rectangle with an aspect ratio like 5X9. These tall, coarse cells make it hard to plot a smooth line, so portions of curves might show up as straight or jagged lines.

The principles are similar to plotting using a graphics tool, except that pixels in images are usually addressed with X=0, Y=0 as the upper-left and larger values like X=100, Y=100 toward the lower-right, so Y grows downward. Character plots are used here to introduce two-dimensional arrays. Later projects for a web-deployed stats app will introduce image tools for PHP, starting with this Image Plotting Example.

Here are Vertical Histograms from data in the files at /home/gschool/*.nbrs.

Structured Notation for vertical histogram. These use the data structures from earlier projects:

[Images: CLI and Image Plots; Vertical Histogram (2 images)]
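A rough sketch of a vertical histogram plotted on a two-dimensional array. It is not the class script: the function name verticalHistogram(), the '#' plotting character, and the sample counts are all made up, and it assumes a 1-based $BinN[] like the one from earlier projects:

```php
<?php
// verticalHistogram() is a hypothetical helper. $BinN holds the count per
// bin (1-based); $height is the number of text rows for the tallest bar.
function verticalHistogram(array $BinN, int $height): string
{
    $bins = count($BinN);
    $top = max($BinN);

    // Build a blank canvas: $canvas[row][col], with row 1 at the top.
    $canvas = [];
    for ($row = 1; $row <= $height; $row++) {
        for ($col = 1; $col <= $bins; $col++) {
            $canvas[$row][$col] = ' ';
        }
    }

    // Fill each column from the bottom row upward, scaled to $height.
    for ($col = 1; $col <= $bins; $col++) {
        $barHeight = $top > 0
            ? (int) round($BinN[$col] / $top * $height)
            : 0;
        for ($k = 0; $k < $barHeight; $k++) {
            $canvas[$height - $k][$col] = '#';
        }
    }

    // Join each row into a line of text, top row first.
    $out = '';
    for ($row = 1; $row <= $height; $row++) {
        $out .= rtrim(implode('', $canvas[$row])) . "\n";
    }
    return $out;
}

echo verticalHistogram([1 => 2, 2 => 1, 3 => 5, 4 => 1, 5 => 1], 5);
```

Row 1 is the top of the plot, so the bars are filled starting from row $height and working upward, the same upside-down Y axis that image tools use.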

(11/18) Browser-side Markup, Style, and Scripting: HTML5, CSS3, and JavaScript

(5 Points Each, see Notes for Due Dates) Run thru W3Schools' tutes for HTML5, CSS3, and JavaScript

Show quiz scores of 90%+, plus a demo at your info250.us website, for HTML5, CSS3, and JavaScript. As you work, make a one-page 'hello world' site that demos HTML5, CSS3, and JavaScript.

The Project: Plan for a 3-page website with consistent navigation among pages, all styled with the same external CSS3. Three parts:

Please use the references! There's a month's time to get up to speed with mobile-first, responsive design with css3, html, form elements, JavaScript, and PHP... Validate early and validate often! W3Schools, Google, and Mozilla are all excellent references:

(1/6) Database and Database Programming for the Web

'LAMP' is prevalent in job listings and stands for Linux/Apache/MySQL/PHP. A couple of other popular Ps are the modern Python and the ancient Perl. The instructor can help with PHP and asks those with no prior experience at database programming for the web to use PHP. Anybody who has already had PHP is encouraged to use these next deliverables as their first assignment with their new programming language.

After learning SQL, these exercises use some of HTML's features for communicating with a server via HTTP: GET data included in URLs, and POST data in HTML FORM elements.
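As a small illustration of GET data, here is a sketch that decodes a query string the same way PHP does when it fills $_GET for a web request. The URL, the field names dept and limit, and the helper field() are all made up so the example can run at the command line:

```php
<?php
// In a web request PHP fills $_GET from the URL's query string and $_POST
// from a submitted form. parse_str() does the same decoding by hand.
$url = 'https://example.com/report.php?dept=INFO&limit=5';  // made-up URL
parse_str(parse_url($url, PHP_URL_QUERY), $get);

// Hypothetical helper: fetch one field, falling back to a default.
function field(array $source, string $key, string $default = ''): string
{
    return isset($source[$key]) && is_string($source[$key])
        ? trim($source[$key])
        : $default;
}

$dept = field($get, 'dept');
$limit = (int) field($get, 'limit', '10');

// Escape anything echoed back into a page.
echo 'Dept: ', htmlspecialchars($dept), ", limit: $limit\n";
```

In a real page the same helper could be handed $_GET or $_POST directly.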


(2/3) Terms of Use for INFO250.us

Please read these and be ready to provide a signed statement that you'll agree with them as we move into and master server-side scripting. Violations of the TOU warrant more sanctions than a mere MIR. We may laugh about them as we discuss past incidents, but some violations of honor and TOU have resulted in some serious changes in choice of university and professional development...

(2/3) Connecting a Website to a Database

Dlv #1, 5 Points: SeSPoP's top two buttons demonstrate using PHP to pitch web pages with data from a database or a response from an OS command. Make a new page in your website similar to LinkToReports.shtml and link to it prominently as SeSPoP Reports from your site's home page. Style your copy of the SeSPoP pages with the same external CSS as your other pages. Make it do the same GET Data reports as it does at SeSPoP, with the resulting reports styled for your site. SeSPoP uses the shtml variation on html that allows sharing common elements on web pages so things like headers and footers can be used by several pages. It also uses php scripts that respond to GET and POST data, and can access the SeSPoP database.

Dlv #2, 15 points: Make three novel queries work at your site in place of each of the GET and POST data demos that are at SeSPoP. All reports must be tabulated, or optionally formatted with divs & css, to show columns and rows. Only show aggregate reports, stats, or queries delivering relatively few lines of results, please.
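For the tabulated reports, a helper along these lines might be a starting point. It is only a sketch: the name rowsToTable() and the sample rows are made up, nothing here comes from the SeSPoP database, and it assumes the query results arrive as a list of associative arrays (the shape that, for example, PDOStatement::fetchAll(PDO::FETCH_ASSOC) returns):

```php
<?php
// rowsToTable() is a hypothetical helper. $rows is a list of associative
// arrays, one per result row, with column names as keys.
function rowsToTable(array $rows): string
{
    if (count($rows) === 0) {
        return "<p>No results.</p>\n";
    }

    // Header row from the column names of the first result row.
    $html = "<table>\n<tr>";
    foreach (array_keys($rows[0]) as $heading) {
        $html .= '<th>' . htmlspecialchars((string) $heading) . '</th>';
    }
    $html .= "</tr>\n";

    // One table row per result row, with every cell escaped.
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

// Made-up aggregate results, not taken from the SeSPoP database.
echo rowsToTable([
    ['Category' => 'A & B', 'Count' => 3],
    ['Category' => 'C', 'Count' => 1],
]);
```

Escaping each cell with htmlspecialchars() keeps data containing characters like & or < from breaking the page.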