Steer away from ‘hobbyist’ MCUs, particularly anything RPi: poor documentation, may implement a slow ‘mass storage bootloader’, non-standard mechanisms like mailboxes etc.
Could prevent having to create own PCB if can buy say in a pro-micro form factor
TRRS (tip-ring-ring-sleeve), i.e. 4-pole, i.e. 4 wires
RF-shielded transceiver (transmitter + receiver)
Use neokey 1x4 to test with debouncing, key matrix etc.
TODO: Have browser new-tab page load a local .html with organised frequently visited links, e.g. SEARCH: google; TOOLS: github, stackoverflow; etc.
smart device just means software interaction capabilities to allow automation
qwerty, dvorak and colemak keyboard layouts
TODO: Phil’s lab for embedded hardware
TODO: rxi microui library demo
Interesting: TrenchBroom 2 Quake map file level editor (easy to parse?)
The !! operator converts a value to 0 or 1:
t += ((F32)!!(flags & flag) - t) * rate;
IMPORTANT: dlclose() will only unload if the reference count is 0. So, with thread_local variables, the library will not actually unload
niagara renderer first 3D game: https://www.pardcode.com/cpp-3d-game-tutorial-series
TODO: Windows draws in WM_PAINT when the window is moved/obscured, and in the main loop otherwise; therefore, 2 places to possibly draw. Does X11 have this also?
resilio sync video downloads: https://hero.handmade.network/forums/code-discussion/t/3242-downloading_videos_using_resilio_sync_formerly_bittorrent_sync
hotloading metaprogramming
X11 copy and paste: https://handmade.network/forums/articles/t/8544-implementing_copy_paste_in_x11
font rasteriser: https://handmade.network/forums/wip/t/7610-reading_ttf_files_and_rasterizing_them_using_a_handmade_approach
Often legacy-supporting code adds lots of corner cases, e.g. OpenGL, i.e. doesn’t have consistent performance characteristics
immediate mode and retained mode are about lifetimes. for immediate, the caller does not need to know about the lifetime of an object.
Discovering that for Linux, the docs (Xlib, ALSA) are the source code, and for development may have to add your user to groups (e.g. for uinput)
Plugins can just be .so files that are loaded with dlopen()/dlsym()
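A minimal sketch of that plugin pattern (the path ./plugin.so and the symbol name plugin_update are made up; link with -ldl on glibc):
#include <dlfcn.h>
#include <stdio.h>

typedef void (*plugin_update_fn)(void);

int main(void)
{
  void *handle = dlopen("./plugin.so", RTLD_NOW); // hypothetical plugin path
  if (handle == NULL) { fprintf(stderr, "%s\n", dlerror()); return 1; }

  plugin_update_fn update = (plugin_update_fn)dlsym(handle, "plugin_update"); // hypothetical symbol
  if (update != NULL) update();

  dlclose(handle); // only actually unloads once the reference count hits 0 (see dlclose note above)
  return 0;
}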
Use of discriminated unions and type fields for type sharing
Visualising is essential for debugging insidious problems… print rows of aligned text.
Conversion from single-state to recording data over time will require memory, which we cast to the appropriate state structure, e.g. DebugState; state structures will have a b32 is_initialised field which gates initialising a memory arena.
Then have CounterState and CounterSnapshot (move per-frame data into this). Each frame, copy src over to dst (with a rolling buffer).
In the rendering of the records, to display snapshots, generate statistics like min, max, avg. These are converged into a Statistic struct for each value. Have helper functions for begin, update and end of a statistic with a value (see sketch after this block).
Now that we have max, loop over again to generate a graph height scale and, say, a red colour scale. Alternatively, we could have an absolute height scale, e.g. total / 0.033f.
(So in drawing, will typically have ‘raw’ structures that just have data, and will then loop over these again to generate relationships to actually draw from etc.)
(TODO: UI to have layout and font information…)
Drawing has left_edge, top_edge. Drawing charts, define chart_height, bar_width, bar_spacing etc. in pixels. Can draw a single-pixel-height reference line. Cycle through colours with arr[index % ARRAY_COUNT(arr)].
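A minimal sketch of the begin/update/end statistic helpers described above (Statistic, F32_MAX and the field names are my own guesses at the shape, not the actual implementation):
typedef struct Statistic
{
  r32 min, max, avg; // avg accumulates a sum until end_statistic() divides it
  u32 count;
} Statistic;

void begin_statistic(Statistic *stat)
{
  stat->min = F32_MAX;   // assumes an F32_MAX constant exists
  stat->max = -F32_MAX;
  stat->avg = 0.0f;
  stat->count = 0;
}

void update_statistic(Statistic *stat, r32 value)
{
  if (value < stat->min) stat->min = value;
  if (value > stat->max) stat->max = value;
  stat->avg += value;
  stat->count += 1;
}

void end_statistic(Statistic *stat)
{
  if (stat->count > 0) stat->avg /= (r32)stat->count;
}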
Drawing just takes state and input. After base drawing, look at input and alter if appropriate.
Draw routines take an origin vector and axis vectors (basis vectors). Axes that aren’t perpendicular cause shearing.
Artists create in sRGB space (in Photoshop), so if we do any math on it (like a lerp), we have to convert to linear space and then back to sRGB for the monitor (if we were to just blit directly, it would be fine). To emulate the complicated curves exactly, could table-drive it; we compromise on squaring (x²) and square root (√x) as the conversion. Without gamma correction, the resultant image will look very dim (due to the nature of the monitor gamma curve). So with RGB values, preface names with linear or srgb.
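A sketch of that compromise: square/sqrt as a cheap stand-in for the real ~2.2 gamma curve (channel values assumed normalised to [0, 1]; needs <math.h> for sqrtf):
r32 srgb_to_linear_approx(r32 srgb01) { return srgb01 * srgb01; } // cheap approximation of the decode curve
r32 linear_to_srgb_approx(r32 lin01)  { return sqrtf(lin01); }    // cheap approximation of the encode curve

r32 lerp_srgb_channel(r32 a_srgb, r32 b_srgb, r32 t)
{
  r32 a = srgb_to_linear_approx(a_srgb);
  r32 b = srgb_to_linear_approx(b_srgb);
  r32 lin = a + (b - a) * t;         // do the math in linear space
  return linear_to_srgb_approx(lin); // back to sRGB for the monitor
}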
IMPORTANT: ALWAYS PROVIDE EXAMPLE!
TODO: raylib game programming (perhaps also for byte jams as well)
TODO: IoT embedded project
TODO: incorporate rss reader feeder.co into morning routine
TODO: memfault debugging with IoT https://embeddedartistry.com/wp-content/uploads/2022/07/IoT-Device-Observability-Requirements-Checklist.pdf?mc_cid=582b1ec73e&mc_eid=UNIQID
working with MPU: https://www.state-machine.com/null-pointer-protection-with-arm-cortex-m-mpu?mc_cid=582b1ec73e&mc_eid=UNIQID
RTOS debouncing: https://blog.golioth.io/how-to-debounce-button-inputs-in-a-real-time-operating-system/?mc_cid=582b1ec73e&mc_eid=UNIQID
TODO: wolfsound youtube for some audio DSP
TODO: Fun graphical displays https://tcc.lovebyte.party/ YouTube LoveByte and FieldFX byte jams
TODO: how to best architect timing requirements given that ISR should be small? e.g. this executes every 1 second, this every 500ms etc.
Thermal imaging coupled with camera more ‘deterministic/maintainable’ than AI vision
There are of course international standards (IEC 60601, or IEC 62304) that you have to follow in order to get your device CE/FDA approval. These standards usually require you to provide a lot of documentation and to pass a series of tests in order to verify that your device is working as intended and isn’t dangerous.
Could even be simplex (one way only). Full-duplex can still be serial, requiring at least two wires (duplex meaning bidirectional). Serial (synchronous with clock and data; or asynchronous) vs parallel.
If you put “used Bluetooth” then I would expect you to talk about which transceivers you used, what the api was like (deficiencies it had, how low level, how was it good), and of course a high level overview of where the Bluetooth spec ends and application begins.
Simply being able to say “I don’t know, but I do know that in the spec/experience it says xyz, so maybe it would do abc, but I would have to look for jkl to be sure.” is huge.
BUTTON BOX 3D PRINT:
* print out small test part with holes to test if button/screw/etc. inserts fit correctly
* consider leaving holes in 3D scaffolding for making wiring easier
* for stylistic finish, can print a separate plate with different resin and attach with hot glue
* will have a bottom plate to account for wiring (have threaded inserts (male-to-female) to account for box moving about when pressing a button)
1. solder wires to buttons
2. solder wires to mcu
3. secure mcu
no heap allocation, just use statically allocated pools: https://mcuoneclipse.com/2022/11/06/how-to-make-sure-no-dynamic-memory-is-used/?mc_cid=26981ac7f4&mc_eid=UNIQID
Fast Fourier Transforms (FFTs) are often used on ADC samples to create a spectrum analyser, which allows for subsequent beat detection?
an adapter board would handle power and signal conversion
tiny solid-state batteries more efficient than coin cell batteries; useful for wearables
TODO: go through https://embedded.fm/blog/embedded-wednesdays
Dagger: CI in code, not YAML
Qucs-S for SPICE: free circuit simulator GUI
TODO: possible program usage of syncthing
bipartite ring buffer more efficient? TODO: seems with RTOS emphasis on ‘tasks’, will require understanding job queues https://www.aosabook.org/en/freertos.html
filters relating to a pedometer: https://www.aosabook.org/en/500L/a-pedometer-in-the-real-world.html
TODO: richard braun embedded (useful C has functions) https://www.twitch.tv/videos/233685076?filter=all&sort=time
I only touch low-voltage 5V side, the high ‘dangerous’ voltage side I leave to hardware engineers
TODO: rhymu ‘rusty keyboard’ series
TODO: FreeRTOS spectrum analyzer, i.e. using ‘tasks’: https://www.youtube.com/watch?v=f_zt7zdGJCA
https://iosoft.blog/2020/09/29/raspberry-pi-multi-channel-ws2812/ addressable RGB: each LED has a serial chip with a serial port in and a serial port out that it funnels through the chain (interesting option of having a light display at sunset).
Using PlatformIO (Python under the hood) is very slow. Perhaps, though, it’s useful for finding libraries or possible environment options to explore, like unit testing.
BOARD-SELECTION: (want a board with WiFi and a builtin display?) IMPORTANT: chip cost may be $4, board cost $15. Go with ARM (although ESP32 is much more powerful than Atmel, e.g. RAM, peripherals, frequency, at similar cost).
LEDS: a tricolor LED can be any 3 colours; an RGB LED is specific colours.
We power at a level appropriate from the cable; the board will enter program mode automatically.
Resolution of PWM setup is dependent on the frequency set up for it (e.g. a resolution of 8 bits gives 100% duty cycle at 255).
red_led_duty_cycle(pwm_max_val - red_component);
green_led_duty_cycle(pwm_max_val - green_component);
blue_led_duty_cycle(pwm_max_val - blue_component);
TODO: is brightness of PWM determined by duty cycle? If so, then a PWM LED fade should just be duty cycle, or should it be colour as well?
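Assuming the duty-cycle helpers and pwm_max_val from the snippet above (and delay_ms()), a fade is then just a ramp of the duty cycle; average power, and so apparent brightness, tracks duty cycle (perceived brightness is non-linear, though):
for (u32 duty = 0; duty <= pwm_max_val; ++duty)
{
  red_led_duty_cycle(duty); // higher duty cycle -> more on-time -> brighter
  delay_ms(5);
}
If the LED is wired common-anode as the inversion above suggests, pass pwm_max_val - duty instead.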
DISPLAY: useful to display FPS (might only need to display this, say, every 250ms as opposed to every frame) and power consumption (perhaps unscaled power, i.e. power if not performing alterations; perhaps only a bright LED if power throttling).
Technology, size and connection: as various permutations of these, have various controller permutations, e.g. SH1107_I2C, SSD1306_SPI, etc. (we probably want a library to draw lines, shapes, fonts, etc.).
I2C was developed by Philips to allow multiple chips on a board to communicate with only two signal wires (SDA/SCL, plus ground); an id is passed on the bus (number of devices is limited by address space; 7-bit addressing typically gives 128 addresses).
Price difference between a shape-drawable display and a character display?
// Moire pattern
for (u32 x = 0; x < display_width; ++x)
draw_line(x, 0, display_width - x, display_height);
// Concentric circles
for (u32 r_count = 0; r_count < 3; r_count += 1)
draw_circle_outline(x, y, r_count);
// rainbow
light_led(led_i, colour_val += 10);
// Marquee (draw colour, then draw moving black spots)
for (u32 i = scroll % 5; i < num_leds; ++i) set_led(i, black);
// stars
leds[random(leds_len)] = colours[random(colours_len)];
delay_ms(200);
// might be better to draw a single LED each pass and use a LOCAL_PERSIST to
// keep track of how many times function is called as have delay
// comet
u32 trailer_pixel_fade_amount = 128;
u32 num_core_pixels = 5;
u32 delta_hue = 4;
r32 comet_speed = 0.5f;
// instead of clearing all LEDs, reduce brightness, i.e. incrementally fade to black
led_fade_to_black(leds, i, amount);
u32 hue = HUE_RED;
hue += delta_hue;
i32 direction = -1;
u32 pos = 0;
pos += (direction * comet_speed);
// IMPORTANT: to get a 'smooth' effect, require floating point speed multiplier
// IMPORTANT: so, animation stages are u32, r32, then easing (kinematics/lerp)/cycling(sin) etc.
// IMPORTANT: gradual colour blending to clear, instead of instant colour clear
// IMPORTANT: then move to perhaps altering colour
// IMPORTANT: when calculating values, the natural derivation may prove too large or too small,
// e.g. cur_time - prev_time. so, add something like a 'speed_knob' that will / or *
// IMPORTANT: floating point coordinates only one part of smoothness.
// also require fractional drawing
// this means that drawing 0.25 of a pixel will be 0.25 of its original colour
colour colour_fraction(colour colour_in, r32 fraction)
{
  fraction = min(1.0f, fraction);
  return colour_in.fadeToBlack(255 * (1.0f - fraction));
}
// IMPORTANT: so, in addition to 'sub-pixel' drawing, also have automatic blending
void draw_pixels(r32 fpos, r32 pixel_len, colour col)
{
  r32 first_pixel_avail = 1.0f - (fpos - (u32)fpos); // get fractional part, i.e. coverage left in first pixel
  first_pixel_avail = min(first_pixel_avail, pixel_len);
  r32 remaining = min(pixel_len, strip_len - fpos);
  u32 ipos = (u32)fpos;
  if (remaining > 0.0f)
  {
    // ipos = get_pixel_order();
    set_led(ipos++, colour_fraction(col, first_pixel_avail));
    remaining -= first_pixel_avail;
  }
  while (remaining > 1.0f)
  {
    set_led(ipos++, col);
    remaining -= 1.0f;
  }
  if (remaining > 0.0f)
  {
    set_led(ipos++, colour_fraction(col, remaining));
  }
}
// IMPORTANT: now increment by say 0.1 instead of 1
// draw comet block
// the delay() value will probably be specific to the CPU frequency
// Check how fast we can draw this out over I2C (want something like get_ms())
// use weighted average to prevent value flickering, i.e. jumping (it will take some number of frames to stabilise as starts at 0)
weighted_average = prev_value * 0.9 + new_value * 0.1; (probably use LOCAL_PERSIST)
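A minimal sketch of that smoothing with LOCAL_PERSIST (names are mine):
u32 smoothed_fps(u32 raw_fps)
{
  LOCAL_PERSIST r32 avg = 0.0f;            // starts at 0, so takes a few frames to stabilise
  avg = avg * 0.9f + (r32)raw_fps * 0.1f;  // exponential moving average
  return (u32)avg;
}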
free-fall: v = sqrt(-2 * gravity * distance_fallen);
r32 gravity = -9.81f;
r32 start_height = 1;
r32 impact_velocity = free_fall(start_height);
r32 speed_knob = 4.0f;
struct BouncingBallEffect
{
u32 output_led_strip_len;
u32 fade_rate;
b32 reversed; // common parameter for effects (draw from len inwards; i = size - 1 - i)
b32 mirrored; // common parameter for effects (size = size / 2)
r32 time_since_last_bounce;
r32 ball_speed;
r32 dampening; // value closer to 1 preserves more energy (0.9f - i / pow(num_balls, 2))
u32 colour;
};
time_since_last_bounce = time() - clock_at_last_bounce;
// constant acceleration function (here gravity is decelerating us)
// this could be considered an easing function
ball_height = 0.5 * gravity * pow(time_since_last_bounce, 2.0f) + ball_speed * time_since_last_bounce;
if (ball_height < 0) // bounce
{
ball_height = 0;
ball_speed = dampening * ball_speed;
if (ball_speed < 0.01f) ball_speed = initial_ball_speed;
}
position = (ball_height * (strip_len - 1) / start_height);
set_led(position, colour);
// IMPORTANT: as there's no need for explicit collision detection,
// if we do additive colours (led[i] += colour;) we get mixing
if (mirrored) set_led(strip_len - 1 - position, colour);
print(" " * sin(freq * i) + "ryan") // map sin() to our desired range
// could also have sawtooth(), cubic() etc.
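A sketch of mapping sin() output onto an index range, plus a sawtooth alternative (needs <math.h>; names are mine):
u32 sin_to_index(r32 t, u32 num_leds)
{
  r32 normalised = sinf(t) * 0.5f + 0.5f;          // [-1, 1] -> [0, 1]
  return (u32)(normalised * (r32)(num_leds - 1));  // [0, 1] -> [0, num_leds - 1]
}

r32 sawtooth(r32 t) { return t - floorf(t); } // linear ramp in [0, 1) that wraps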
#define cli() nvic_globalirq_disable()
#define sei() nvic_globalirq_enable()
// fractional drawing required for smooth 'slow-motion' effects
blend_amount_self = 2, blend_amount_neighbour1 = 3 (take 2 parts of itself, 3 parts of neighbour)
blend_total = 5;
// for parameters essentially / for setting relative to length,
// + for offset, and * for a scaling factor
// cool
for (u32 i = 0; i < size; ++i)
heat[i] = max(0L, heat[i] - random(0, ((cooling * 10) / size) + 2));
// move heat up
for (u32 i = 0; i < size; ++i)
heat[i] = (heat[i] * blend_self + heat[(i + 1) % size] * blend_neighbour) / blend_total;
// ignite a spark
for (u32 i = 0; i < num_sparks; ++i)
{
if (random(255) < sparking_threshold_probability)
{
u32 y = size - 1 - random(spark_height);
heat[y] = heat[y] + random(160, 255);
}
}
// convert heat to colour
for (u32 i = 0; i < size; ++i)
{
  colour = map_colour_to_red(heat[i]);
  draw_pixels(i, 1, colour);
}
FAN_STRIP_SIZE = 16;
NUM_FANS = 3;
ZERO_PIXEL_OFFSET = 4; // how much we have to rotate by for the 0th pixel to be on the bottom
enum PIXEL_ORDER
{
PIXEL_ORDER_SEQUENTIAL = 1 << 0,
PIXEL_ORDER_REVERSE = 1 << 1,
PIXEL_ORDER_BOTTOM_UP = 1 << 2,
PIXEL_ORDER_TOP_DOWN = 1 << 3,
PIXEL_ORDER_LEFT_RIGHT = 1 << 4,
PIXEL_ORDER_RIGHT_LEFT = 1 << 5,
};
u32 get_pixel_order(i32 pos, PIXEL_ORDER order)
{
// this allows negative indexing, i.e. -1 is last
while (pos < 0) pos += FAN_SIZE;
u32 offset = (pos + ZERO_PIXEL_OFFSET) % FAN_STRIP_SIZE;
u32 reverse_offset = (pos + FAN_STRIP_SIZE - ZERO_PIXEL_OFFSET) % FAN_STRIP_SIZE;
// start in particular fan, e.g. 0, 16, 32, 48, 64, etc.
u32 fan_base = pos - (pos % FAN_STRIP_SIZE);
switch (order)
{
case PIXEL_ORDER_SEQUENTIAL:
return fan_base + offset;
case PIXEL_ORDER_REVERSE:
return fan_base + FAN_STRIP_SIZE - 1 - reverse_offset;
case PIXEL_ORDER_BOTTOM_UP:
return fan_base + FAN_STRIP_SIZE - 1 - (VERTICAL_LOOKUP[pos % FAN_STRIP_SIZE] + ZERO_PIXEL_OFFSET) % FAN_STRIP_SIZE;
default:
return fan_base + offset; // remaining orders not covered in this note
}
}
// IMPORTANT: connecting multiple strips is daisy chaining, so still same output
// allow for offset from fan_index, and also just passing number treating all fans together
// IMPORTANT: optional arguments, use struct
for (u32 i = 0; i < NUM_FANS; ++i)
  draw_fan_pixels(sin(x), 1, red, PIXEL_ORDER_LEFT_RIGHT, i);
for (u32 i = 0; i < NUM_PIXELS; ++i)
  draw_fan_pixels(sin(x), 1, red, PIXEL_ORDER_LEFT_RIGHT, 0);
boustrophedon: 0 > 1 > 2 > 3 > 4 | 8 < 7 < 6 < 5 <- IMPORTANT: LED matrix arranged as boustrophedon columns
if (x % 2 == 0) return (x * height) + y; else return (x * height) + (height - 1 - y);
FFT divides samples into frequency buckets. A logarithmic scale is employed in a spectrum analyser to account for the high frequency range being more greatly separated than the low frequency range.
wire strippers +/- affects tension which will affect any wire braiding
put heat shrink on before wire soldering
will often twist multiple common wires (e.g. ground) together and solder to single ground source
(have gloves on for this)
wiring can be too fine for holding a particular amperage?
TODO: look into sleeving wires
TODO: something like loctite threadlocker is a liquid that prevents threads loosening due to vibration
Adhesive tape (with conductive pads on the end) on the bottom of the LED strip to stick wires to.
Then solder the end tips to connect bottom to top
(have some electrical tape for this)
Also, buy wires merged together and only pull apart the end points
With multimeter, verify that no short circuit by checking continuity amongst wires
Critical section means non-concurrent access to this code.
Achieved with a mutex obtain and release (lock/unlock) pair?
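A sketch of a critical section using POSIX threads (on an RTOS the equivalent is the scheduler's mutex take/give, or briefly masking interrupts):
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static u32 shared_counter;

void increment_shared(void)
{
  pthread_mutex_lock(&lock);    // enter critical section
  shared_counter += 1;          // only one thread executes this at a time
  pthread_mutex_unlock(&lock);  // leave critical section
}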
Modern OS will have code memory write protected for security reasons. Bare metal can do this however
IMPORTANT: for a preemptive multitasking kernel like Linux, a call to pthread_yield() (allow other threads to run on the CPU) is not necessary. However, for embedded, maybe
LEDS: WS2812B is the standard (NeoPixel is the Adafruit brand). Each chip is an RGB LED with a tiny controller. Probably has a low drop-out regulator onboard so 5V can be dropped to 3.3V. Can buy as grids or strips. Have a library generate the square wave forms, e.g. FastLED.
By powering the board through its pins, can ensure enough current is passing to it. Can simultaneously connect a USB cable and it will defer to the power pins over it?
With power, still have to be careful if, say, setting max brightness and all LEDs to white. Seems it’s common to have GRB as the data format? TODO: only in video 4 are power calculations done? (have a function to limit max watts of power drawn?)
// #define AT_MOST_N_MILLIS(NAME, N) static AtMostNMillis NAME(N); if (NAME)
// (AtMostNMillis provides operator bool() { return ready(); })
8BIT MATH (another reason to inspect assembly):
// qadd8(i, j) == MIN((i + j), 0xFF) (saturated add)
// qsub8(i, j) == MAX((i - j), 0)
// mul8(i, j)  == (i * j) & 0xff
// add8(i, j)  == (i + j) & 0xff
// sub8(i, j)  == (i - j) & 0xff
// sin8(x)     == (sin((x/128.0) * pi) * 128) + 128
// cos8(x)     == (cos((x/128.0) * pi) * 128) + 128
#if QADD8_C == 1
unsigned int t = i + j;
if (t > 255) t = 255;
return t;
#elif QADD8_ARM_DSP_ASM == 1
asm volatile("uqadd8 %0, %0, %1" : "+r" (i) : "r" (j));
return i;
#endif
TODO: gdb scripts https://github.com/espressif/freertos-gdb
TODO: is an IoT cloud data store like ThingSpeak used commercially?
TODO: investigate Renode for embedded testing
TODO: networking with containers: https://iximiuz.com/en/series/computer-networking-fundamentals/
TODO: look at embeddedartistry course developments for ‘building testable embedded systems’
TODO: working with time-series data
LIN protocol for vehicles (seems to be a host of protocols specific to automotives)
trimpot, a.k.a. trimmer potentiometer (a small preset potentiometer)
MEMS (micro-electromechanical systems) motion sensor
TODO: using in-built security features of chip, e.g. AES-256
buck converter steps down DC-DC voltage, while stepping up current (various step-down mechanisms in relation to AC/DC and voltage/current)
although bluetooth LE say 50m distance, a repeater can be used (and really for any RF)
TODO: profiling not your application https://linus.schreibt.jetzt/posts/qemu-9p-performance.html
TODO: https://dev.to/taugustyn/call-stack-logger-function-instrumentation-as-a-way-to-trace-programs-flow-of-execution-419a
MIPI (mobile industry processor interface) developed DSI
TODO: outdoor project: https://hackaday.io/project/186064-green-detect look into further hackaday.io/projects
TODO: embed LEDs onto 3D printed board
contrasting telephoto and wide angle lens
TODO: differences between ribbon cable and normal
TODO: TPM (trusted platform module) for embedded https://www.amazon.com/Trusted-Platform-Module-Basics-Technology/dp/0750679603
TODO: freeRTOS https://www.digikey.com/en/maker/projects/getting-started-with-stm32-introduction-to-freertos/ad275395687e4d85935351e16ec575b1
TODO: bootlin courses
TODO: GPS tracking, LoRaWAN (long range, low power), satellite connection, etc.
TODO: 3D printing; outdoor case enclosure
Digital laser dust sensor (particulates in the air). PM1 being worst as most fine.
TODO: DC vs stepper motor?
TODO: what is the use case for a motor driver such as the L298 Dual H-Bridge Motor Driver and the Tic T500 USB Multi-Interface Stepper Motor Controller (circuitry without MCU?) driver lines are diode protected from back EMF?
Growth of ‘enviro sensor kits’, i.e. test air quality etc. to create a smart home or garden. Growth of ‘IoT’ sensor kits/smart home. Growth of ‘AI sensors’. (Perhaps more relevant to me is power/energy, automation and sensing.) Growth of sensor compounding, e.g. video now with LiDAR to detect depth, gesture detection sensors.
H-bridge is IC that switches voltage polarity, e.g. run DC motors forwards or backwards Rectifier converts AC to DC (transformer is high voltage AC to low voltage AC)
TODO: is something like SerialStudio necessary for visualisation? Could not live stream gnuplot?
TODO: domestica course on creative coding. css, html graphics also for 2D animations?
Marketing refers to some MCU work that react to environment as physical computing
TODO: investigate both AVR github and gists: https://github.com/BobBurns
CFFI to create a python interface for C for something say like a console session?
TODO: Perhaps for a machine just targeting embedded development, use WSL to easily use GUI debugger apps?
TODO: Essentials of putting metadata section in firmware binary, such as version string! (particularly so test runners can utilise this information)
Might see GNSS + INS (inertial navigation system; i.e using IMU as well)
TODO: amplifiers, e.g. class-D etc. MEMs accelerometers for vibration detection in cars
TODO: RTC is external to oscillator?
An analog LED is a single colour. A digital LED allows controlling each colour separately (so, a.k.a. addressable, e.g. APA102 LEDs). This is done through an LED driver chip (think the WS2812B chip for NeoPixel).
A channel in a sensor is a quantity measured. So an acceleration sensor could have 3 channels, one for each axis
DSP: http://www.dspguide.com/pdfbook.htm terms like THD (total harmonic distortion), PFC (power factor correction), HV (high voltage)
circuit design in software (perhaps parallel with Robert Feranec?): https://www.jitx.com/
Go through talks on memfault.com blog
Simulator testing gives much faster turn-around times, can add sanitisers without memory concerns, pass peripheral data in from file, draw to window instead of display etc.
For memory allocator, set to 0 on free in debug builds? Investigate glibc malloc tunables, e.g. in debug build: export MALLOC_PERTURB_=$((RANDOM % 255 + 1))
network game (with some testing tools): https://github.com/TheSandvichMaker/netgame
TODO: how to calculate: “Its power supply system can supply up to 4200 mAh and run for more than 5 hours”
SiC (silicon carbide) power module. Whole area in power management (also leads onto safety regulations, e.g. SIL)
use a time-series database where everything is stored ordered (as opposed to relational, based on set theory, whose order is determined in clauses; also get full SQL support here and can store more data types)
investigate automated ‘agriculture’, e.g. smart agricultural kit, automatic pot plant watering, etc.
databases useful for tracking time series IoT sensor data
how to overclock and underclock? how are these different to adjusting clock scalers?
awesome escape puzzles: https://www.youtube.com/c/PlayfulTechnology
investigate cpu fault handling: https://github.com/tobermory/faultHandling-cortex-m
DC motor: raw PWM signal and ground signal controls speed; high rpm, continuous rotation (e.g. fans, cars)
servo: DC motor + gearing set + control circuit + position sensor; signal controls position; limited to 180°; accurate rotation (e.g. robot arms)
stepper: can be made to move precise, well-defined ‘steps’, i.e. jumps between electromagnets; position fundamental (e.g. 3D printers)
AVR used in lower-end (8bit) as less complex, cheaper than ARM
TODO: working with thumb instructions TODO: best practices monitoring systems in the field (4G?)
note, if using newlib, will still have a _start
May compile for different architectures in embedded for different product lines e.g. low-end fit bit, high-end fitbit
Compile with different compilers to see performance benefits at end. Also possible may have to use a particular compiler for specific hardware.
In reality, don’t want an RTOS if timing is very critical. Most MCUs don’t even have multiple cores, so an RTOS adds software overhead (or is there additional hardware to emulate cores?). Even without an RTOS, will most likely have some timer and separation of tasks. The main overall benefit of an RTOS is consistent driver interfaces across multiple MCUs (useful for, say, complex bluetooth/ethernet stacks etc.). Most MCUs are overpowered for what they do, so using an RTOS is probably a good idea.
Tamper response is usually done with a button on the board that gets activated when the case opens. This will trigger an interrupt handled by the RTC (real time clock).
cpu has modes like Stop mode that is a power saving mode.
battery balancing relevant to multiple cells. TP4056 Lithium Battery Charging Board?
verification: meets requirements
validation: does it solve the problem
authenticate: identity
authorise: privilege
token: something used to authorise
HTML not turing-complete, i.e. can’t perform data manipulations
robot ➞ renode (why not just qemu with shell scripts perhaps?) seems that robot/renode parsing of UART cleaner?
Embedded systems are special purpose, constrained, often real time (the product may be released in a regulated environment with standards, e.g. automotive, rail, defence, medical etc.). Challenges are testability and software/hardware compromises for optimisation problem solving, e.g. bit-banging or a cheaper MCU, external timer or in-built timer; adding hardware increases power consumption, e.g. ray tracing card or just rasterisation. big.LITTLE clusters: 1/4 scalar performance for 1/2 power consumption is a good tradeoff.
what would a segfault ➞ coredump look like on bare-metal?
C type qualifiers are idempotent, e.g. const const const int is fine.
sparse + smatch: static analysis tools that give named address spaces (near, far pointers)
synchronisation constructs:
* lock (only one thread access at a time)
* mutex (lock that can be shared by multiple processes)
* semaphore (can allow many threads access at a time)
is ARM nested interrupt by default?
C atomics just insert barriers?
A function is reentrant if it can be swapped out and its execution re-entered. Functions that operate on global structures and employ lock-based synchronisation (could encounter deadlocks if called from a signal handler) (and are thread-safe), like malloc and printf, are not reentrant
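A small illustration of the distinction (my own example):
// NOT reentrant: hidden static state; an interrupted and re-entered call corrupts it
u32 next_id_not_reentrant(void)
{
  static u32 id = 0;
  return ++id; // read-modify-write on shared state
}

// reentrant: all state is passed in by the caller, nothing global is touched
u32 next_id(u32 *id)
{
  return ++(*id);
}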
qemu useful over native for when word-size different. also, assembly inspection
TODO: DSP, RTOS, wireless + IoT, battery/power, peripheral protocols (USB, LCD, etc.), CODECs, optimisations particular to MCUs, assembly knowledge, SQL, C++ STL, Python, systems testing (continuous integration), bootloaders
ARM, RISC-V, xtensa
https://drh.github.io/lcc/
Garbage collection replaces us with a ‘search’ of our memory and decides when to free. So really only use it if you can’t figure out when to free something. This search is not free.
No language implements a feature that determines a memory footprint, i.e. how much memory we use
Know OS specific forking (process launching), file checking/reading/writing, IPCs
Steps for adding new feature: create a new file, e.g. commands.c and include it in main.c (order it before things that will use it) then do a git commit. This gets us off on a nice feeling In new file, create function with barebones functionality that can then be inserted into location and called/tested
Simple word parser uses at[1] != '\0' && at[1] == 'a'
Simple single-line parser: first separate by whitespace with while (at[0] != '\0') { ... break; } plus eat_spaces(), find_ch_from_left()
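A sketch of that style of parser (eat_spaces() reimplemented here; next_word() is my own name):
void eat_spaces(char **at)
{
  while (**at == ' ' || **at == '\t') (*at)++;
}

// copy the next whitespace-delimited word into word_buf; return 0 when the line is exhausted
b32 next_word(char **at, char *word_buf, u32 word_buf_len)
{
  eat_spaces(at);
  if (**at == '\0') return 0;
  u32 i = 0;
  while (**at != '\0' && **at != ' ' && **at != '\t' && i < word_buf_len - 1)
  {
    word_buf[i++] = *(*at)++;
  }
  word_buf[i] = '\0';
  return 1;
}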
3D printing ideas: https://www.youtube.com/c/3DSage/videos
michael ee for RTOS: https://www.youtube.com/playlist?list=PLLYZoEqwvzM35p2Kc7bk7bkwxLtTVwpvy
A real time scheduling algorithm is deterministic (not necessarily fast), i.e. it absolutely must meet its deadline (soft real time is it should) (real time processing means virtually immediately). So, a higher priority task will preempt lower priority tasks. FreeRTOS will have a default idle task created by the kernel that is always running (this idle task gives an indication of a low-power mode for free?)
middleware extends OS functionality, drivers give OS functionality
freeRTOS makes money through some commercial licenses (with support), middleware (tcp/ip stacks, cli etc.)
TODO: setting freeRTOS interrupt priorities is sometimes done wrong? tasks are usually infinite loops
FreeRTOS is more barebones (only 3 files) and effectively just a scheduler (so has timers, priorities) and communication primitives between threads. In most embedded programs, sensors are monitored periodically; time and functionality are closely related. For small programs a super loop is fine. However, when creating large programs, this time dependency greatly increases complexity. So, a priority based real-time scheduler can be used to reduce this time complexity (priority over time-slice is more efficient in most cases, as if not operating, can go to sleep). In addition, a scheduler allows for logical separation of components (concurrent team development). Also allows easy utilisation of changing hardware, e.g. multiple processor cores. COTS (commercial-off-the-shelf) as opposed to bespoke.
SEPARATE API CALLS TO AVOID LOW PRIORITY CORRUPTING BY BEING INTERRUPTED BY HIGH PRIORITY. Define a context structure that will hold api call information:
typedef struct file_api_params_s
{
  uint8_t api_opcode;
  uint8_t control_opcode;
  uint8_t file_handle;
  semaphore sem;
  void *param1;
  size_t param2;
} file_api_params_t;
Depending on the RTOS you can actually define your messages and message pool ahead of time - or you’ll be providing a pointer to the parameter structure in your message:
file_api_params_t file_api_pool[n]; // where n = number of tasks that can access the file system + 1
Each api function does the following:
- allocates a free param structure from the pool
- populates the parameters from the api call
- initializes the semaphore
- allocates and initializes a message to send to the file system task
- adds the message to the queue
- waits for the semaphore
The filesystem task does the following (see sketch below):
- waits for messages from the message queue
- checks the message validity and that the sender is still valid (for some systems that have bad behavior if the semaphore api does not check for validity)
- performs the requested operation
- increments the semaphore
- goes back to waiting for a message
So, the filesystem task is assigned high priority.
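A hedged sketch of that flow; pool_alloc/pool_free, queue_send/queue_receive and semaphore_init/wait/signal are made-up stand-ins for whatever the actual RTOS provides:
void file_api_read(uint8_t file_handle, void *dst, size_t len)
{
  file_api_params_t *p = pool_alloc(file_api_pool);  // 1. grab a free param structure
  p->api_opcode  = FILE_API_READ;                    //    (FILE_API_READ is illustrative)
  p->file_handle = file_handle;
  p->param1      = dst;
  p->param2      = len;
  semaphore_init(&p->sem, 0);                        // 2. caller will block on this
  queue_send(file_task_queue, &p);                   // 3. hand the request to the file task
  semaphore_wait(&p->sem);                           // 4. sleep until the file task is done
  pool_free(file_api_pool, p);
}

void file_task(void)
{
  for (;;)
  {
    file_api_params_t *p;
    queue_receive(file_task_queue, &p);              // block until a request arrives
    // ... validate the message and perform the requested operation ...
    semaphore_signal(&p->sem);                       // wake the caller
  }
}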
If taskA dependent on taskB, taskB should be of higher priority than taskA
zephyr more of an RTOS a step towards linux with lots of drivers e.g. LVGL, LittleFS, etc. (with this comes a lot more configuration hassle)
TODO: zephyr series (perhaps better to do this first as an RTOS as more modern resources than FreeRTOS?) https://training.golioth.io/ https://blog.golioth.io/zephyr-threads-work-queues-message-queues-and-how-we-use-them/?mc_cid=f6bdf458d1&mc_eid=UNIQID https://blog.golioth.io/taking-the-next-step-debugging-with-segger-ozone-and-systemview-on-zephyr/
lego technics for gear ideas: https://www.youtube.com/playlist?list=PLJTMHMAVxQmvTGatIj-X2rJrN8aGJEaQD
want to make something like stargate wheel
first step in embedded debugging commandments; thou shalt check voltage (e.g. check 5V going to LCD by placing multimeter on soldered pin heads)
We can see in x86 (WikiChip) that instruction and data caches are separate from the L2 cache. On an ARM SoC block diagram (datasheet), see the D-bus and I-bus to RAM. They introduce things like CCM (core coupled memory) and ART (adaptive real-time accelerator) that add some more Harvard-like instruction paths; essentially, more busses instead of more cores like in x86 (i.e. a lot more than just a CPU to be concerned with). Also have more debug hardware. Don’t really know why the Cortex-M4 has an MPU.
AXIM, AHB and APB are ARM specific. Will have a bus matrix (which allows different peripherals to communicate via master and slave ports by requesting and sending data). Off this, have AHB (higher frequency and higher bandwidth); like to think of AHB as the host bus as it feeds into APBs via a bridge. APB1 is normally half the frequency of APB2. We can see that DMA can go directly to APB without going through the bus matrix. So, this is relevant for speed and clock concerns: which bus are we going through? Also, if we are DMA’ing something and CPU’ing something, ensure they are not fighting on the same bus, i.e. spread out the load. Lower power peripherals on lower frequency busses? This level of knowledge further emphasises the need to know the hardware to understand what is going on.
https://go.memfault.com/debugging-embedded-devices-in-production-virtual-panel?mc_cid=32b3cae3e7&mc_eid=UNIQID https://go.memfault.com/embedded-device-observability-metrics-panel-recording?mc_cid=32b3cae3e7&mc_eid=UNIQID
use specifically for understanding mesh networks in context of bluetooth?: https://academy.novelbits.io/register/annual-membership?_gl=115qz5rp_gaNzM2MjQ4ODUzLjE2NTY0OTIwNjc._ga_FTRKLL78BY*MTY1OTU3OTAxNS4xLjEuMTY1OTU3OTIyMy4w&_ga=2.149910452.341656361.1659579015-736248853.1656492067
https://interrupt.memfault.com/blog/ota-delta-updates?utm_campaign=Interrupt%20Blog&utm_medium=email&_hsmi=222505339&_hsenc=p2ANqtz-9kmqPywlKxifduWJneXhUh1h_RQ4bf-v41o2qF8iBciZYc9beFlhwM4EiOVbP3DKUl8kxc_4GOIdpzvkJi5iOGzgwSWA&utm_content=222505339&utm_source=hs_email
automated crash reporting: https://lance.handmade.network/blog/p/8491-automated_crash_reporting_in_basically_one_400-line_function#26634 in embedded, how are metrics transmitted remotely, i.e. via bluetooth to phone than web? too much power if directly to web?
something related to HIL (hardware-in-loop testing) https://blog.golioth.io/golioth-hil-testing-part1/?mc_cid=da33e3796b&mc_eid=UNIQID
something related to systems testing with aardvark SPI/I2C adapter (more tutorials with bus pirate) 10:00 time mark: https://www.youtube.com/watch?v=N60WSQc-G_8&list=PLTG9uzDd_HQ84wVz0DwQ5_mwf1GnpY6LB&index=11 seems indepth bus pirate manual is on git? http://dangerousprototypes.com/docs/Bus_Pirate https://learn.sparkfun.com/tutorials/bus-pirate-v36a-hookup-guide/all (look for device specific tutorials on bus pirate website) http://www.starlino.com/bus_pirate_i2c_tutorial.html
seems that RMII (reduced media-independent interface) is a pin layout to connect MAC devices (flexible in implementation). Can be implemented to support say an RJ45 connector
stm32 datasheet and reference manual (documents of different depths about the same MCU) nomenclature. Will have ‘Application Notes’ that detail specific features like CCM RAM. A datasheet will often be related to a family, e.g. stm32f429xx; therefore, at the front it will have a table comparing memory, number of GPIOs, etc. for particulars.
bit twiddling: http://graphics.stanford.edu/~seander/bithacks.html
interesting courses: https://pikuma.com/courses
is power profiler kit specific to each board necessary, e.g. nordic, stm32?
software-blogs: https://www.gingerbill.org/article/ https://www.rfleury.com/
https://linuxafterdark.net/ podcast
embedded-blogs: https://patternsinthemachine.net/ https://blog.feabhas.com/ https://blog.st.com/ https://dmitry.gr/?r=05.Projects https://tratt.net/laurie/blog/ http://stevehanov.ca/blog/ https://thephd.dev/ https://www.embeddedrelated.com/blogs.php https://lemon.rip/ https://jpieper.com/ https://www.embeddedrelated.com/ https://patternsinthemachine.net/category/general/ https://embeddeduse.com/ https://martinfowler.com/articles/patterns-of-distributed-systems/?mc_cid=3835da293a&mc_eid=UNIQID
memfault
https://embeddedartistry.com/fieldatlas/embedded-software-development-maturity-model/?mc_cid=da33e3796b&mc_eid=UNIQID
Seems that IAR compiler produces smaller, faster code than gcc?
Fallback to: https://grep.app/search when searching for code snippets on github?
UART is a protocol for sending/receiving bits. RS232 specifies voltage levels.
DSP: https://www.youtube.com/playlist?list=PLTNEB0-EzPluXh0d_5zRprbgRfgkrYxfO
Wifi/Ethernet: https://www.youtube.com/watch?v=dumqa78j1sg&t=1046s
Lora/Nrf/IoT connections: https://www.youtube.com/watch?v=mB7LsiscM78
courses: https://www.youtube.com/watch?v=dnfuNT1dPiM&t=25s https://www.udemy.com/course/mastering-microcontroller-with-peripheral-driver-development/?ranMID=39197&ranEAID=YuKpx7UHSEk&ranSiteID=YuKpx7UHSEk-VcxzUKL7wAR1VyB2muQaaQ&LSNPUBID=YuKpx7UHSEk&utm_source=aff-campaign&utm_medium=udemyads https://www.udemy.com/course/microcontroller-dma-programming-fundamentals-to-advanced/?ranMID=39197&ranEAID=YuKpx7UHSEk&ranSiteID=YuKpx7UHSEk-85NcF8rkTsNIoMCfETBU3g&LSNPUBID=YuKpx7UHSEk&utm_source=aff-campaign&utm_medium=udemyads https://www.udemy.com/course/mastering-rtos-hands-on-with-freertos-arduino-and-stm32fx/?ranMID=39197&ranEAID=YuKpx7UHSEk&ranSiteID=YuKpx7UHSEk-qjeQee0Iel.PZ6z63nXsmw&LSNPUBID=YuKpx7UHSEk&utm_source=aff-campaign&utm_medium=udemyads
memory align up: sbrk((x + 7) & ~7); sbrk() is system call malloc uses?
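Generalised form of that round-up (alignment must be a power of two):
#define ALIGN_UP_POW2(x, align) (((x) + ((align) - 1)) & ~((align) - 1))
// ALIGN_UP_POW2(13, 8) == 16, ALIGN_UP_POW2(16, 8) == 16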
silicon mainly obtained from quartz. electric arc when insulator air is supplied enough energy to ionise
(HSPI is high speed parallel interface)
Astable means no stable states, i.e. is not predominantly low or high, e.g. square wave, sine wave etc. An oscillator generates a wave (could be for a carrier or a clock).
RC (resistor-capacitor) oscillator generates sine wave by charging and discharging periodically (555 astable timer) internal mcu oscillators typically RC, so subject to frequency variability
Max out the HCLK in the clock diagram as we are not running off battery. Will have clock sources, e.g. HSI, HSE, PLL. output of these is SYSCLK. SYSCLK is what would use to calculate cpu instruction cycles.
a clock is an oscillator with a counter that records number of cycles since being initialised
A crystal generates a stable frequency. A PLL is a type of clock circuit that allows for high frequency, reliable clock generation (the setup also affords easy clock duplication and clock manipulation). So, a PLL system could have an RC or crystal input. Feeding into it is a reference input (typically a crystal oscillator) which goes into a phase detector and then a voltage controlled oscillator to output the frequency. The feedback of the output frequency into the initial phase detector can be changed. Adding dividers/pre-scalers into this circuit allows getting a programmable frequency. So, a combination of a stable crystal (which however generates a relatively slow signal, e.g. 100MHz) and high frequency RC oscillators (a type of VCO; voltage controlled oscillator).
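A worked example using STM32F4-style PLL naming (HSE crystal, PLLM/PLLN/PLLP dividers) just to show the arithmetic; the values are illustrative:
// f_VCO = f_HSE / PLLM * PLLN ; SYSCLK = f_VCO / PLLP
u32 hse_hz = 8000000;                        // 8 MHz crystal
u32 pllm = 8, plln = 336, pllp = 2;
u32 sysclk_hz = hse_hz / pllm * plln / pllp; // = 168 MHz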
openocd -f /usr/share/openocd/scripts/interface/jlink.cfg -f /usr/share/openocd/scripts/target/stm32f4x.cfg should open a tcp port on 3333 for gdb
The CPU architecture will have an exception (a CPU interrupt) model. Here, reset behaviour will be defined. The 32-bit ARM Cortex-M4 has an FPU (A for application, M for microcontroller, R for high performance real time).
TODO(Ryan): avr vs arm vs risc-v vs x86 vs powerpc vs sparc vs mips (what motivations brought about these architectures?) as often Harvard architecture (why?)
Von Neumann: RAM (variables, data, stack) + ROM (code/constants) + I/O all on the same CPU bus. Harvard has an ICode bus for ROM, and a system bus for RAM + I/O. This allows operations to occur simultaneously. So, why use Von Neumann?
It is labelled as an evaluation board. Different boards use different ICDI (in-circuit debug interfaces) to flash through SWD via USB-B, e.g. Texas Instruments use Stellaris, STM32 ST-Link.
Is fixed point used anymore?
TODO(Ryan): Why is a floating pin also called high impedance? To avoid power dissipation and an unknown state, drive it with an external source, e.g. ground or voltage.
Pull-up/down resistors are used for unconnected input pins to avoid a floating state. So, a pull-down will tie the pin (when in an unconnected state) to ground, i.e. 0V when the switch is not on.
IMPORTANT: Although enabling internal resistors, must look at board schematic as external resistors might overrule
Vdd (drain, power supply to chip)
Vcc (collector, subsection of chip, i.e. supply voltage of circuit)
Vss (sink, ground)
Vee (emitter, ground)
the sparsity of linux can make configuration vary e.g bluez stack -> modify policies
For long range, LoRa or Sigfox; essentially tradeoffs between power and data rate. IEEE 802.11 group for WLANs (WiFi - high data rate), 802.15 for WPANs; 802.15.1 (Bluetooth - LE variant - heavily used in audio), 802.15.4 low data rate (Zigbee implements this standard).
Unlike Windows MSDN, Linux documentation is mostly source code (not good, as if it's not fast/easy, it's not used). So, it's essential to have something like ctags and to compile from source (sed -e "s/-Werror//g" -i *.make). Source and unit tests are the documentation in many Linux projects.
xor with itself sets to 0
“1’s and 0’s” + matej youtube channels
https://embeddedartistry.com/blog/2018/06/04/demystifying-microcontroller-gpio-settings/?mc_cid=c443ecbc14&mc_eid=UNIQID https://embeddedartistry.com/blog/2019/04/08/a-general-overview-of-what-happens-before-main/?mc_cid=c443ecbc14&mc_eid=UNIQID
unit test notes from udemy
Aardvark adapter essential for automated testing (so, an adapter of sorts should always be used for automated testing?)
COMMAND PRESENTATIONAL RESPECT
SELL TO EXISTING CUSTOMERS
42:00 https://www.youtube.com/watch?v=rRzM7MkppEo
Branding/idea and website are essential
Find products from amazon, etsy, ebay, etc. However, only sell with aliexpress as it offers dropshipping methods, i.e. no packaging. Alibaba is wholesale, not dropshipping, i.e. must buy in bulk.
paid ads on facebook, instagram, tiktok
for supplier, can get faster than aliexpress however cost more
add x3-x5 markup on product to pay for ever increasing advertising cost
(buying in bulk will get faster shipping times) manually -> DSers app to automate purchases -> sourcing agent. Be upfront with customers regarding shipping time at the start (we are just testing, i.e. for consistent sales and proof of demand at the start, so long shipping times are fine; just mention in FAQ or shipping policy etc.) (get customisable product later, i.e. branded box, as will have to pay in bulk on Alibaba for them to do this)
Handling Returns: 1. damaged: get user to send photo, have them keep the product, get supplier to ship another 2. ok: ask user to pay for shipping costs back to you. then refund them
TikTok film yourself using a product, see if video can go viral (whole skill in itself)
ads spend $50 per day. around $150-$200 to test a product until find a winner
Front-loaded time investment. Once a good product is found, 30-60min per day on ads (ads manager, metrics, etc.), 2-3 hours a day overall. Spend daily time on product research (DEVELOP SKILLS TO FIND A PRODUCT THAT IS WANTED AS A GOOD PRODUCT IS KEY), copywriting, competitors, etc.
niche store
targeting impulse buyers, not value shoppers (so, sell under $100) show innovative product with wow factor, they buy on the spot
if product out of stock, refund and send them a 10% discount code
digital age allows for microexperiments to test market, instead of wasting capital on something that could fail (i.e. put advertising into market first to see how customers react)
Test market: we are only concerned with determining purchase intent. Customer experience will initially be poor due to long shipping times. Aliexpress -> CJDropshipping, Spocket, Zendrop, uDroppy, WiiO
Sourcing agent (when have 10-20 orders a day) Can reduce pricing, faster shipping, custom packaging, thank you notes, etc.
Local fulfillment: bulk buying and shipping to a 3PL (3rd party logistics)
Does it target a niche customer? (write down target audience and the problems you are solving for them) Get a product that solves a problem or adds value to their life in a meaningful way. The product must have a unique selling point: high perceived value or problem solver, e.g. posture corrector -> solves problem; vegetable slicer -> adds value. Can’t be bought in stores (not commonly advertised, not a basic product in store) (no one will buy it from an unknown store if not unique). Targeted to a customer niche, lightweight and easy to ship, possible to add markup. Pick something that you have expertise in, allowing for easier market research - health (back pain, sun-burn); pet supplies. Needs to have WOW/UNIQUE-factor to GRAB ATTENTION IN AD.
Have 5-10 products on website to build trust. Only put ads in for one HERO PRODUCT at a time
DAILY RESEARCH TO SELECT PRODUCTS: TIME CONSUMING PROCESS TO UNDERSTAND MARKET. REGULAR ACTIVITY, NO OFF SWITCH
- manual (see what products are selling well on websites); Google Trends to verify search volume for product
- social media ads (follow big theme instagram pages, initiate checkout to get ad algorithm to follow)
- (START WITH THIS: spy tools (combine previous 2))
- IMPORTANT: The only way to know if it will work is to test with your own ads. These are just indicators
- identify how much a competitor is selling for by reverse searching the image
As you gain experience, will see products over and over again, i.e. product saturation (Google Trends can show this)
Make website fit your brand before adding ads
Make sure domain is available before setting site name (short word namelix.com) Also ensure domain is verified and web events prioritised before ads?
Tracking essential before running ads, so as to better optimise ads Facebook/tiktok pixels tracking code on website installable through shopify app Google analytics as well
(Zoho) info@storename.com
Shopify payments (credit cards) and paypal (business account)
Shopify apps:
* Fulfillment app: DSers -> aliexpress (CJDropshipping, S-pocket, weho, zendrop all have their own apps)
* Aftership (helps customers track order)
* Klaviyo/SMSBump marketing channels (email/text, e.g. abandoned cart emails etc.)
* Vitals (Product Reviews, Volume Discounts/Product Bundles (Upsells), Currency Converter, Visitor Replays, Wheel of Fortune, Frequently Bought Together, Related Products and Product Description tabs)
(want to save purchase information to allow one-click purchasing if returning, to increase LTV)
(offer discount in post-purchase emails)
(GO INTO SHOPIFY AND MODIFY CONFIRMATION EMAIL TO INCLUDE 24HR DISCOUNT - CONFIRMATION EMAILS HAVE 100% OPEN-RATE)
Put shipping times at bottom of product page, shipping policies and FAQs
Essential pages: Home Page, About Us, Product Pages, FAQs, Legal Policies and Contact Us add a Shipping Policy and Track Your Order page via the Aftership app
Conversion rate optimisation (if not getting sales, one of these needs improving):
* Product Photos (5-10 photos; flat-lays on a white background and lifestyle shots of someone using it) initially source from Aliexpress, Alibaba, Amazon (later down the track can: take photos yourself, 3D realistic model from fiverr/upwork, get professional photos etc.)
* Pricing (Compare at Price on Shopify for discount offering?) (Test free shipping. Perhaps best to offer free shipping over $X to increase AOV)
* Product Description: hero lines describing one core benefit (i.e. how it improves their life) then 5 bullet-points describing more features (i.e. what the product does). MAKE SURE BENEFIT IS TANGIBLE/SPECIFIC AND METRIC BASED IF POSSIBLE, E.G. 1000 songs in pocket. Can look at real reviews and write the description on their pain points
* Product Reviews (import reviews from Aliexpress via Vitals app, ensure grammar is correct and have both good and bad reviews) (hero product should have the most)
* Website Speed (jpgs over pngs (crushpics app), no videos, low number of apps)
Make sure to have a 30-day money-back guarantee. Ensure customers can contact you easily
Anything over 3% good conversion rate
landing page are more suited for digital products (and take a lot longer to make) (can later do with say Shogun, zipify pages etc.)
Human beings are biologically programmed with 8 main desires: Survival, food/drink, freedom from danger, sex, comfortable living conditions, being superior to others, protection of loved ones and social approval. Ads that target these will be subconsciously watched
Paid traffic and organic traffic (TikTok with someone using your product; study and recreate competitor videos). AT LEAST $50 A DAY ON ADS. $150-$200 TO TEST A PRODUCT (does not include $50 for ad creative). Run ads for 4-7 days to test (if they perform terribly on day 1, then probably kill earlier) (have to give time for the ad algorithm to optimise to find your buyers). Should get sales within 1-3 days. After this 4-7 day period, choose to scale up or kill off ad-sets (terrible is 0.5% Link CTR and $3+ Link CPC - maybe only wait 48 hours)
Facebook (includes Instagram) and TikTok (perhaps if targeting younger audience) (tiktok does not require lots of followers to go viral, just good content) Start with one platform, Facebook (king for ecommerce due to data collection and targeted ads) With Facebook, set feedback score to send after 8 weeks to ensure to not being blocked if too many bad reviews
Google ads, Snapchat and Pinterest aren’t for impulse buyers (perhaps explore Google ads when doing a branded campaign, e.g. people search for your website)
If doing influencer marketing stick with theme pages as opposed to personal brands, e.g. advertise on an instagram themed page?
Writing a blog and loading it up with keywords can drive ad-free traffic by ranking in google searches. However, long and difficult process
Need to have video ads first? Hire ad creatives on Fiverr #resources section
Ad Metrics:
INTEREST:
* Link CPC (cost per click): how much it costs for one person to click on your ad (strive for under $1)
* Link CTR: how many people click through to see the website after the ad (strive for over 1-2%) (CPC and CTR correlated)
ULTIMATE METRIC (if this is profitable after 4-7 days, scale product):
* Cost per purchase: how much it costs for someone to purchase on the website
For Facebook ads create a business account from your main profile. Has to be a real account so as to not get banned. If you get banned without doing anything wrong, message them and it should be resolved in 1-2 days.
* Don’t call out people directly, e.g. ‘People have a problem’ over ‘you have a problem’
* No outlandish claims
Add FB pixel helper extension to browser Before launching, click on pages, add to cart, checkout and see if events are firing a pixel?
Let ad platforms optimise age/gender ad settings, so leave these broad
Go after 1M audience size? Optimise for purchase? (ignore warnings asking to optimise for funnel actions)
Start with (ABO) ad-set budget optimisation. Later might do CBO (campaign budget optimisation)
1%, 3% etc. LLA (look-alike-audiences) target for people who have already made purchases (best to do when say 300-500 events) (involves creating a custom audience and targeting for them specifically?)
Improve CTR by testing new ad hooks (first 3 seconds to capture attention; the rest should keep attention till the end; should be replayable; copy competitors). Add more calls to action in the ad copy, i.e. add more links. If you have successful ad-sets, modify their hooks and rerelease
CPM (cost per 1000 impressions) is cost of ads, which is largely out of your control. if you have better ads, i.e. more shares and views, the cost of ads will be cheaper
Not getting sales, look at funnel: 1. Ad metrics (are people sharing, etc.) 2. Website metrics (where’s the drop-off in customer activity) 3. New product
Consider changing creatives or extend ad running time if breaking even or slightly profitable, otherwise move on. don’t get attached to products (a winning product is waiting out there)
Make-A-Video AI seems promising
Good news: first ever drug to slow down the cognitive decline of Alzheimer’s. Also a drug to slow down ALS. Perhaps prolonging drugs are the first step?
DreamFusion 2D-to-3D looks promising
Will future cars be synced with mobile OSs from Apple or Android? Already seeing the start of this with CCC (Car Connectivity Consortium) pushing smartphone car unlocking
Unfortunately malware spreads through low-hanging CVEs or social engineering
Memory market collapse, i.e. a type of chip market
How nice that Apple’s anti-tracking crackdown only applies to third-party apps
With the growth of hardware, really seems that adaptive learning algorithms are going to be used instead of solving the problem explicitly Perhaps AI solving the most-optimal implemention of an algorithm is more appetising for me
More companies branching into GPUs with AI functionality, e.g. Intel Arc, Acer etc. Furthermore, more companies branching into VR, e.g. Lenovo, Facebook etc.
Is a smart ring really any better than a smart watch?
As you would expect, Raptor Lake CPUs faster single thread speed at a lower wattage
Will it be common place for impaired actors like Bruce Willis to sell likeness for deepfakes? In China, using virtual influencers.
Again, YouTube offers another thorn; restricting 4K access to subscribers
MCUs that support OTA firmware updates will typically have built-in key storage
Although decentralisation sounds good (own cell network etc.), it requires user maintenance which people will pay others to do and we’re back to square one in a way
How does Google Tensor G2 chip with various CPU architectures, e.g. Cortex-A78, Cortex-A55, Cortex-X1 work?
OS’s designed for wearables, e.g. WearOS for Android. I suppose wearables have become an established device class target?
It really isn’t tinfoil hat mentality to be wary of updating unless necessary, e.g. most recent kernel patch affected Intel graphics displays
Floating pod homes in Panama no one asked for.
In a literal sense Moore’s law is dead, however chiplets pose interesting alternatives
Character.ai chatbot creator. In fact, AI editor/generators for most artforms seems to exist (video runwayml)
The fluidity of OSs continues with new Ubuntu 22.04 replacing PulseAudio with PipeWire, X11 with Wayland.
Having a live session USB is essential to always give root access to filesystem
Can now get Ubuntu Pro for free, which just extends LTS to 10 years
Fast GUI file searching with fsearch
Xbox streaming game console. High network speeds seems to make cloud gaming more affordable. However, don’t like the idea of constant network connectivity and the power of the provider to shut you out.
The 15th Century saw the greatest appearance of ‘geniuses’. Despite access to information, genius has declined. Seems to require professional tutors at a young age to instill human social engagement
Thanks to the TCC compiler, can compile C in memory and load it, hence using it in some ways like a scripting language. However, this can create serious security holes. Could still use it in a sandboxed process, e.g. with libseccomp. WASM allows running a subset of C++/C (and, well, anything that compiles to WebAssembly) in the browser as a sandbox. Could use with the wasmer library. Indeed, with WASM, can compile a native library and use it in the browser, like SQLite3
With the steady proliferation of VR gaming, it’s a good thing I was not young during this time
DynamicPixelTuning (DPT) promises to make every pixel capable of outputting all colours. Therefore, get 3 times resolution than having combined rgb pixels
Very helpful ranger preview in conjunction with rc.conf and scope.sh
Apple more like a bank with savings account for Apple Card
Interesting new Danish political party with decisions made by AI. I think AI as a collaborator is promising
Despite new phones offering a plethora of new features, don’t assume that they are bug-free, e.g. pixel phone crash detection malfunctioning on roller coaster, not allowing dialling to 000, etc.
Realise you can overclock RAM with Intel XMP (extreme memory profile)
Although you hear new Apple and Microsoft chips, almost certainly ARM under the hood
Intel NUC (next-unit-computing) just marketing term for small form factors
Although largely unnecessary now, Nvidia has lite hash rate which impedes mining performance in order to get GPUs into the hands of gamers
The decline of the lone app, as Microsoft 365 leviathan
I see AR being more useful than VR, e.g. hololens for soldiers
Meta headset will use eye-tracking to position ads
Perhaps an emeritus professor I’m more willing to listen to.
What I used to take as prima facie from news outlets, now no longer
Unfortunately UNSW basketball CC’d not BCC’d
To no one’s surprise, Windows updates nerf ryzen performance
Marketing plug for cloud gaming consoles when in fact standard phones can do the same thing
With improved network performance, perhaps cheaper to offload calculations to powerful servers (VDI: virtual desktop infrastructure)
NB-IoT (narrowband, i.e. low power). Also have CoAP (constrained application protocol) for embedded devices to access Internet
Easy to fall prey to the act of not adding anything useful to a product, but simply adding IoT and calling it smart, e.g. smart condom
HDR (high dynamic range) refers to the colour spectrum, implemented in technologies such as Dolby Vision. Dolby Atmos is a surround sound technology
RCS seems to be better than SMS
Not really decentralisation with cloud, as many services are just operated by a handful of large companies (not what DARPA intended!). Cloud is really just reduced complexity. It excels when the application is simple and low traffic (managing a large application in the cloud is just as difficult as on bare metal), or when your traffic patterns are unpredictable. Otherwise, you’re paying an unjustified premium. Sold as computing on demand (no complexity) when in reality it’s just renting computers at a higher premium. Lend your talents to your own machines, rather than Amazon or Google
Will NFC for door locks take hold?
AI for vaccine development sounds promising
Are the petabit speeds of research optical chip really that useful if hardware can’t process that fast? Will radio only get us so far?
Probably better to use url-shortener service for sharing links
Perhaps explainpaper.com could break into reading programming papers
More firm on not using Apple, e.g. Apple not allowing other app stores, taking a revenue percentage of ads in apps, etc.
Bioengineered plants seem promising in say absorbing air pollutants
Amazon continues to kibosh any semblance of non-monopoly, now getting into home insurance
Raptor Lake overclock to over 8GHz (however with liquid nitrogen…)
With development of cheaper more powerful hardware, companies introducing more ‘gaming’ brands e.g. Phillips new gaming monitor
Another cautionary tale to not update with Apple update nerfing ANC earphones
Xcode cloud subscription. oh no wasting so much time on naming of new processes, e.g. ‘Developer Experience Infrastructure’
Another billionaire investing in a ‘utopian cities’ seems more dystopic than anything. Promulgation of venture capitalists, angel investors, etc.
PICs have a weird instruction set, so generally have to use assembly over a C compiler (no good free compiler). So, use AVR for low-end tasks like just LED driving? e.g. TLC5971
time-of-flight sensor like an infrared radar
over-current and over-voltage detectors for when using charging devices, e.g USB-C charger? so, perhaps investigate/understand voltage regulators? also have UVLO (undervoltage-lockout)
new USB-PD (power delivery) interface delivers more power over USB-C
I can see machine learning ‘appropriate’ in say cleaning up random noise, e.g. brain signals
interesting set of questions to inspect a software engineering workplace: https://neverworkintheory.org/2022/08/30/software-engineering-research-questions.html?utm_source=tldrnewsletter
things like database accelerator library indicative of normal software not being fast
MOSFETs are a type of transistor. different transistors for say quick-switching, low signal, high frequency, amplifier etc.
USB4, PCIe 5, DDR5 emerging standards.
Could buy a GPU for running interesting machine learning applications like Stable Diffusion (prompt engineers, oh dear…) https://www.krea.ai/ GPU structure similar to CPUs, e.g. have cache, GDDR6 memory (simpler parallelisation)
float-toy nice web visualisation tool
interestingly CRT can scale better than fixed-set resolution LCD
New TV/monitor combination. QD-OLED curved monitors less fatiguing as they physically match our eye’s shape. Flexible/bendable monitors no one asked for…
cool looking completely submerged server desktops. pure water is a very good insulator (our tap water will have chlorine for example) obtained via ozone treatment
the Nintendo DSi implemented augmented reality. Virtual reality is completely immersive
how does wireless charging work without a pad, i.e. no induction charging? interesting power sharing from phone to watch
trie -> gzip: still have to decode, and the resulting memory is the same as before compression. Succinct data structures are designed to not have to be decoded, i.e. everything is stored as bits. ∴ uses a lot less memory (however, only for larger datasets)
compound literals have issues with debuggers and maintenance
C11 static assertions, how to use? performance improvements?
static const char sound_signature[] = {
    #embed <sdk/jump.wav>   /* note: #embed is C23 */
};
static_assert((sizeof(sound_signature) / sizeof(*sound_signature)) >= 4,
              "There should be at least 4 elements in this array.");
anonymous structs
int fnc(int arr[static 2])
C23: can prevent VLAs by defining constexpr; true, false keywords; nullptr (it’s a pointer, not a number); enum e : unsigned short { x }; digit separators; function attributes
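A small sketch exercising those listed features, assuming a C23-capable compiler (e.g. recent GCC/Clang with -std=c2x); the names find42 and count are made up:
#include <stdio.h>

enum flags : unsigned short { FLAG_A = 1, FLAG_B = 2 };   /* fixed underlying type */

constexpr int count = 1'000'000;                          /* constexpr object + digit separator */

[[nodiscard]] static const int *find42(const int *arr, int n)  /* standard attribute syntax */
{
    for (int i = 0; i < n; i++)
        if (arr[i] == 42) return &arr[i];
    return nullptr;                                       /* nullptr, not 0/NULL */
}

int main(void)
{
    constexpr int n = 4;
    int data[n] = {1, 2, 42, 4};                          /* constant size, so not a VLA */
    bool hit = (find42(data, n) != nullptr);              /* bool/true/false are keywords */
    printf("%d %d %d\n", hit, count, FLAG_B);
    return 0;
}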
smarter devices not necessarily better, e.g. printers with end-of-life software
lol, ferrocene is a developing rust toolchain
PLC more expensive microcontroller that is more versatile, e.g. handle voltage overload (often used in assembly lines)
Removing shared pointers probably gives later code in the pipeline a small cache boost as all the data is now co-located. Excessive logging is a cause of bad performance. Excessive heap usage is bad for performance if on an OS. Note there are static, stack and heap memory areas
lol, killedbygoogle.com. perhaps not wise to rely on new services
is self-administration the only benefit of an inhaled COVID vaccine?
doesn’t seem like quantum computers will be able to solve any practical tasks (unless the program exploits quantum parallelism to a large extent)
cloud computing with containers growing, e.g. AWS/Azure ➞ Kubernetes ➞ docker
it’s sad, but you really could do a stand-up of modern software projects, e.g. “introducing Goliath, an automatic external dependency manager. under the hood we use a Nextrus package manager. can be scripted with Freasy language extension of Frosty core language”
seems a lot of phones adding satellite connectivity even though it’s much slower
although space travel seems like the future, how to cope with serious health effects like space radiation
opening up web pages from tech news sites is just awful. inundated with floating content-blocking video ads, permanent marquee ads embedded between paragraphs and the browser title bar flashing with bot message notifications, bot message popups, cookie accept bars, …
asciinema useful terminal recording tool with website to host
potential for sim-locking being a thing of the past with a push to eSIM cards
New AI that can edit videos with textual prompts. The dramatic shift in technology that mobile and cloud devices have brought is being realised by natural language processing. As text is seen as a universal input in a lot of Unix programs, interesting possibilities.
eyes convert photons to an electrochemical signal. transduction converts one energy form to another. CCD (charge-coupled device; less noise, more power, lower speed) and CMOS (consumer grade) are common camera sensors. digital cameras convert photons to an array of pixels, represented as voltage levels. quad pixel camera sensors combine four adjacent pixels in this array
round design of new Nvidia GPU may be evidence for the 20-year cyclical nature of fashion
amazing new 6GHz stock speed Intel CPU
although all crypto is a ponzi scheme, it seems the goals of Ethereum to perform secure financial transactions are headed in a better direction than Bitcoin. furthermore, following The Merge, it does not rely on power-hungry mining. this in turn has led to a lot of 2nd-hand GPUs flooding the market
avx512 can’t actually get the performance stated by Intel ‘marketing’ as it heats up and the CPU thermally throttles
good to see some work being done on the interoperability of digital wallets
in general, Occam’s razor approach to muscular issues
smart power homes whereby the source of the power can be discerned, i.e. if it’s clean or not. extends to phone chargers with knowledge of this
interesting Github Copilot Labs able to do rough translations between languages
good to see (in some ways) hardware vendors pushing for AI standards to allow for greater optimisation
ATM machines can be targeted for card-skimmers
possibility of sprite animation in terminal using chafa tool to create ascii block art
Intel new naming scheme confusing using brand name as category, i.e. ‘Intel Processor’ instead of ‘Pentium’
YouTube no respect for customers, running a clandestine experiment showing up to 5 ads at the start instead of spacing them out. Furthermore, Mozilla researchers found that buttons like ‘Stop recommending’, ‘Dislike’ have next to no effect
Growing space economy with NASA ISS becoming privatised
Gaming phones with extended fans/cooling and amenable gripping features
I suppose it would be kind of cool to travel in a plane going Mach 1
There can be such extremes in software, e.g. from programming hardware in FPGA to writing natural-language descriptions for an AI to convert to code
Cool 3D printing pens, although slow, can be used to say repair a chipped brick
Cool USB SAMD boards (possible malware creation)
Amazing creations from AI from still images, videos, digital assistants: https://threadreaderapp.com/thread/1572210382944538624.html Companies like DeepMind, OpenAI. Chatbot AI used in UI testing. GPU DLSS (deep learning super sampling) is AI upscaling
Finally, alleviating the ambiguous USB 4.0 v2.0 naming scheme with devices having clear USB 40Gbps, 240W printed on them
Compelled to investigate a service like paperspace in order to run OpenAI and other AI projects
Promising Framework laptop built with easy repair in mind. However, subpar experience
Teachers are fired for moonlighting
Record-breaking DDoS attacks: 17.2 million requests per second, 3.4 terabits per second, 809 million packets per second
Google have size to challenge Dolby with new HD audio standard. However, whilst seeming altruistic, is just so they don’t have to pay Dolby licensing fees in their hardware
Interesting Domain Brokerage services to allow you to get already used domains
Limiting the number of prompts will not work to prevent MFA fatigue attacks
Record-breaking figures achieved with overclocking can be deceiving as they may employ high-end coolers or even liquid nitrogen
Still Windows updates break things, e.g. NVidia drivers
The inundation of JavaScript web frameworks has provided a learning point for adopting new technologies. In general, stick to familiar technologies and only adopt bleeding edge later
Cool application of IoT: https://hackaday.com/2015/11/24/building-the-infinite-matrix-of-tamagotchis/
Although lower-res, CDs aren’t lossy compressed like Bluetooth or spotify.
Amazing gigabit speeds being sold for ISPs
Sad that bariatrics is even a field of medicine
Serverless is just a term for a caching server closer to clients
In the same unrequested vein as foldable PCs, now have slidable PCs
Perhaps going through graphical ASMR programming videos is the ‘enjoyable’ remedy
More encompassing/combined sensors, e.g. AI vision sensor, gesture detection sensor
Increased power of technology, developing perhaps faster interfaces, e.g. search by photo, speech, etc.
Sextember cornucopia of condoms. Don’t have to be Nostradamus to work out parents parking. In effect, most people’s journey to learning is bespoke. My engineering mind (which has been molded by experience) results in largely ad hoc responses. Many positive social dealings at university are unfortunately pyrrhic
I believe decoding brain waves into functional output is major change in my future
Unikernel has applications bundled inside kernel (so like eBPF) for high performance
Amusing Dead Internet Theory that the Internet died around 2016 and is now largely bots
Oh dear, javascript smart watch
Interesting 3D print to make diffusing sheet for say LEDs to create different ambiences
AI now being used to improve compression algorithms
Amazing potential of ‘molecular computers’ to make drugs with precise traits (could extend to human’s gaining amazing a priori skills)
JPEG XL better compression and lossy + lossless mode
Again, Microsoft updates not working. This time, not actually properly applying vulnerability update
Interesting can just buy off the shelf drone and attach something like an ESP32 to send data back to us
Cool program that sort of gamifies video conferencing allowing for individual chats: gather.town
Like ‘Scrum Certification’ now with Matter have an official certification process
Good work being done in legislating IoT labels, e.g. like nutrition labels that give information about the sensor data collected by the device
Have GPU microarchitectures like RDNA.
Chiplets seem the future of shrinking size and expanding computing power, now seen in GPUs. They can easily be recombined to create custom designs
VSR (virtual super resolution) has the GPU render at 4K and then downscale if necessary to the monitor’s resolution. However, if you want say 8K gaming, will require a DisplayPort cable
Robots beginning to see more everyday use, e.g. waiters. Also, have robots learn in the field, e.g. Texas University have robots walking around campus
Resistive RAM uses analog memory cells to store more information with less energy
GPT-4 upcoming text generation AI. Perhaps text-to-speech, language transcription too diverse to solve without AI, e.g. deepgram program
Although some people experience myocarditis after COVID vaccine, myocarditis long been linked to a number of viral infections
Borg like spacecraft, i.e. cubesat now in orbit
Silicon carbide power supplies more efficient
AI requires high quality data, i.e. created by skilled individuals. Also requires unique data. Might not have enough unique data in the future
Unfortunate that Intel releasing software-defined-silicon, i.e. pay-as-you-go to enable certain hardware features
Not even subtle that ads are no longer targeted for you, but rather shills
Prompt engineers for many AI generators, e.g. text-davinci-3
GPU AI DLSS improve performance by offloading frame generation
Biodegradable sensors made out of organic circuitry proof of concept
Watching me sleep on the floor is watching how the sausage is made in regards to my posture
Serendipitously obtain a bokeh photo with my outdated phone camera
Politics are infiltrating areas of technology: * Rust toolchains on embedded. Rust developers have explicitly tweeted saying technology will always be political * NASA Artemis Accords have first line stating primary focus on women and person of colour on Moon Leads down the path of FSF, and in fact any cultural revolution which thinks it can do whatever it wants in the name of the people. Although perhaps ad hominem, experience has shown that they’re all polite until they’re not.
Seems with big-tech, marketing is often more important than the underlying technology: 170km ‘The Line’ ecotopia, metaverse, Tesla etc. They make wild claims and the general public has no way of knowing the facts behind them (perception vs reality). A scary thought is that in this age, it’s possible for self-sustaining narratives capable of deflecting facts
ARM’s open model allowed vendors to implement custom MPUs, which saw it gain dominance over other RISCs like MIPS and AVR.
Tech companies becoming conglomerate monopolies, e.g. tiktok music, apple tv etc.
Much like the C++ standards committee, concerns of ‘ivory tower’ nature of smart home Matter standards
Seeing consequences of covid semiconductor demand in chip shortages. Exacerbated by increased usage in the automotive industry and large dependence on foundries in Taiwan
Annoyed at the web: * Seemingly lack of awareness of bloated and abstracted infrastructure * New technologies in the sphere are just sensationalised titles with little substance e.g. homescreen social media, css layout model and js frameworks
The importance of programming to a physical machine is paramount. The underlying technology is always changing, e.g. arcade machines, consoles, phones, watches, plastic 4bit processors
EU lawmakers want USB-C for all mobile devices to reduce e-waste. Makes me postulate a technocracy.
Time-of-flight sensors can be used to detect water levels
EMV (europay, mastercard and visa) secure payment technology embedded in credit card chip
Waited until USB-C was becoming standard to introduce reversible USB-A (achieved with a movable plastic divider and duplicated pins). Intel Thunderbolt is faster than USB-C, yet the ports still look very similar
AI in sensors used to alter configuration to optimise power consumption. AI-generated voice, text; Disney can now alter age. AI parsing of voice (natural language processing). In summary, generative AI everywhere. In fact, with ChatGPT being able to explain technical concepts, birth of AGI (artificial general intelligence). Perhaps this could be used as a sort of offline search engine. In fact, ChatGPT can generate prompts for DALLE. Whilst ChatGPT solves problems considering computers as generalised machines, it seems eventually it will get there. So, embedded probably the last to be tackled due to unique systems
Genetic engineering in flora seems more appetising, e.g. drought-resistant wheat, air-purifying plants
Batcat tool is cat with syntax highlighting
Skeptical of announcements made by budget-starved laboratories (e.g. universities) about breakthroughs for technologies decades away, e.g. fusion There are often caveats and furthermore, commerciality is most likely decades away
Seems that bipartisan government action required to fix rats nest of drivers in modern OSs in a similar vein to EU enforcing anti-competitive laws on Apple to allow third-party apps, USB-C etc.
Have to be careful not to engage in technological contempt culture, e.g. language wars. As technology changes rapidly, address changes with temperance
If social media was all RSS (really simple syndication), things would be much simpler
We are in a ‘gig’ economy, i.e. contract work
Example of Apple silicon is M1 chip. Placing some Apple silicon inside new removable monitors to use less power from attached computer
Optical computing, i.e. analog, uses less power and excels at linear algebra, making it ideal for machine learning
Is prompt engineering now an important skill, as opposed to a sad state of affairs?
Would controlling the weather be nice? e.g. releasing sulfur to reflect more sunlight
Medical developments not just better treatments but also easier detection, e.g. Alzheimer’s via blood test. Even still, immune diseases like rheumatoid arthritis are more easily treated with drugs than osteoarthritis
Home appliances with WiFi connectivity allow for remote updates, e.g. LG dishwasher
Seems all phones will be satellite based in the future
Mass production of TSMC advanced 3-nm chip underway. Unfortunately, most likely due to Samsung chips having low yield, they don’t have as high QA for voltage regulation as compared to TSMC
Interesting ‘virovore’ found, an organism that eats viruses
Could the US China chip ban result in China becoming more self-reliant in the future as it’s forced to invest in own chip production?
Further proliferation of OSs, e.g. Ubuntu Touch, Edubuntu, Kubuntu etc.
Google wants RISC-V as tier-1 architecture for Android. Part of further push for open source usage so as to not pay licensing fees, e.g. developing their own video codec etc.
Death of narrator with AI text to speech
Nvidia AI on certain GPUs can upscale older blurry videos
AI training on users that will eventually replace them, e.g. Adobe tracking artists workflow, Github copilot, etc. However, AI does not create innovative results.
Interesting storing application state in base64 url. No server required, and browser history becomes undo-redo
Thread is new low-power protocol for Matter (and therefore IoT devices, i.e. mesh network). Similar to Zigbee and Z-Wave
hierarchy → topology: p2p → (mesh, bus); client-server → star
Advent of more software in cars has led to subscription based services, e.g. heated seats. However, adding software increases complexity: * Will lack of connectivity (e.g., no cell coverage where you’re at) mean that the features are disabled until you get back into cell range? * Can the servers be DOS’d so that nobody’s seat heaters work? * If I pay for a subscription and sell a car, does the subscription stay with the car? Or will it be like Tesla’s approach, where the new owner has to pay to unlock the software features, even if the previous owner paid? * What if there is a bug on the server that incorrectly reports my subscription status? Will I be refunded, fully or partially? * What happens when you can’t get in touch with customer support because your subscription isn’t being properly detected on the hardware? * What happens if the hardware breaks, but you’ve paid for the subscription? Is repair to heated seats covered under terms of the subscription, or will that be pushed to owners? * What happens if this strategy is used by a smaller company than BMW, who suddenly goes out of business and bricks your otherwise perfectly working hardware due to shutting down servers?
Are big-tech companies realising the flaws in their ‘Simplicity Sprints’ culture? Slowing hiring, realising they have way too many employees
BCI (brain computer interfaces) are a real future
Unfortunate that ‘true security’ makes things more complex and inconvenient, e.g. yubikey
Physics head-scratcher: dark matter makes up 80% of universe, yet we can’t detect it?
Meta releasing AI chatbot to wild has again resulted in racist and sexist comments
More evidence of contrasting quality between modern hardware and software with worldwide Google outage caused by software update
Oncall software engineers akin to casual teachers
What are FPGAs and how do they allow the creation of hardware for emulating old games like GBA? EU enforcing USB-C forces Apple to switch to USB-C. Yay!
Oh dear, Java is not dying out (state of the octoverse)
Realise I’m highly misanthropic. Frequent STMicroelectronics newsletter coverage of IoT (and the many, many protocols) and machine learning is indicative of a trend in industry
Hi-fi audio: newer term to mean higher fidelity/data resolution. New OLED TVs (contrast, blacks). QLED/QNED (brightness) adds a ‘quantum dot’ layer into the white LED backlight LCD sandwich
5.1 means 5 speakers, 1 subwoofer. Woofer, subwoofer, speaker, tweeter
Things getting smaller and more data, e.g. ‘normal’ sized VR headsets
I really don’t see fold phones as necessary…
Are we moving down the road of homogeneous programming (e.g. specifically target the CPU or GPU) or heterogeneous programming (e.g. CUDA)?
Although an update pathway is provided from 20.04 to 22.04, I’m reluctant to do so due to possible configuration issues. This is why I chose LTS 5-year support (2025)
Interesting maglev trains reach amazing speeds and produce minimal noise due to lack of friction
Interesting blood transfusions from older mice to younger mice, the younger mice display characteristics of old mice
Bitcoin is a Ponzi scheme as almost no one actually uses it in transactions, and is purely speculative. Does not create anything. Interesting manifestation of capitalism.
Moore’s law: the number of transistors doubles every 2 years. Although not strictly true, the general trend is holding. Proebsting’s law states that compiler improvements will double program performance every 18 years. Therefore, be cautious about the performance benefits a compiler brings; focusing on programmer productivity is more fruitful. In general, newer compilers take longer to compile, but produce slightly faster code, maybe 20% faster.
Cyberdecks are evidence that the trend of going smaller isn’t always aesthetic
Growing trend to have workstations operate in the cloud or containerisation. For testing yes, for development however?
Strive towards much more potent nuclear fusion (100 million °C) reactors as opposed to nuclear fission (neutron splits uranium, same as original nuclear bombs)
Amazing that certain old rpm harddrives were susceptible to crashing when ‘Rhythm Nation’ played as the resonant frequency was the same
Streaming outnumbering cable and broadcast TV
ripgrep a much faster and user friendly grep! unar will automatically handle pesky non-folder archives
Interesting to see if mir wayland will take over xlib x11
Although the open nature of RISC-V gives it some economical advantages, historically the ISA has not been the major driving factor in widespread adoption. Rather, who invests the most in R&D, e.g. many places will develop ARM, with RISC-V go on your own.
Security an ever-present issue, e.g. every Ubuntu weekly newsletter gets a list of security updates
Privacy laws prevent recording keystrokes in app, however can record other information like time between keypresses etc. to identify you, e.g. TikTok
Chiplets connection of chips. So, can build chiplets that aren’t SoC, e.g. just CPUs and SoCs without chiplets Intel R&D into chiplet technology stacking presents it as a future possibility (Apple already uses it with two M1 max chips to M1 ultra)
ACM (Association for Computing Machinery) Turing Award is essentially Nobel Prize for Computer Science. Not applicable for me as awards largely for academic contributions like papers/reports published e.g data abstraction (Liskov substitution, Byzantine fault tolerance), parallel computing (OpenMP standard). In some sense, the modern day Booles and Babbages I’m more concerned with engineering feats in software products.
An unfortunate reality of open tech, AI being used to make paywalls ‘smarter’
Read a Google research project on removing noise in photos. Investigate source to test and am completely put off by the amount of dependencies involved: conda (why not just whole hog and docker), python, jax for TPU (python to tensor processing unit), external repositories This also applied to the ‘amazing’ AI image generator Stable Diffusion (I suppose high VRAM requirements also) Docker has uses in CI
AI-for-everything dogma is becoming more pervasive, with ‘clusters’ to train models. Although Tesla can build a supercomputer to train, like all dogmas, it’s not applicable to everything (readability, debugging go down)
Even air-gapped computers are not safe from sniffing
Seems that any in-demand tech device subject to bot scalping
As Moore’s law is widening, i.e. was 2 years now 4 years, companies creating own hardware, e.g. YouTube chip to handle transcoding
sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get -y install gcc make pkg-config apt-transport-https ca-certificates

if ! [ -f /etc/modprobe.d/blacklist-nouveau.conf ]; then
    echo "nouveau is not blacklisted, doing so and rebooting"
    # Blacklist nouveau and rebuild kernel initramfs
    printf "blacklist nouveau\noptions nouveau modeset=0\n" >> blacklist-nouveau.conf
    sudo mv blacklist-nouveau.conf /etc/modprobe.d/blacklist-nouveau.conf
    sudo update-initramfs -u
    # NOTE: after rebooting we need to run this file again
    sudo reboot
fi

if ! [ -f /usr/bin/nvidia-smi ]; then
    echo "nvidia driver is not installed, installing"
    # Install NVIDIA Linux toolkit 510.54
    wget https://us.download.nvidia.com/XFree86/Linux-x86_64/510.54/NVIDIA-Linux-x86_64-510.54.run
    chmod +x NVIDIA-Linux-x86_64-510.54.run
    sudo bash ./NVIDIA-Linux-x86_64-510.54.run
    rm NVIDIA-Linux-x86_64-510.54.run
fi

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.3.1/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.1-465.19.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-3-local_11.3.1-465.19.01-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-3-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
ssh-add -L or simply look in ~/.ssh directory (this is essential for private key)
seems that packaged things in the cloud aren’t all that flexible, e.g. ML-in-a-box cannot have independent components updated
important to run apt update on first running
have to use -O and enclose with "" for wget
annoyingly have to remove the open source nouveau driver; also install CUDA not from the apt repository
Based on what it was trained on (LAION 400M internet-scraped image-text pairs, which contain violent and sexual images, as opposed to DALLE-2), output may be biased, e.g. nerd might bias towards wearing glasses
A quadratically scaling solution is intuitive. However, as we know every match is unique, a linearly scaling solution is obtained with a hash map. The C++ STL implementations of hash tables are sets (just keys) and maps. Unordered variants are raw hash maps; ordered variants use a self-balancing red-black tree yielding logarithmic time. Simplest hashing function: (x >> 4 + 12) & (size - 1). Important to keep in mind we are executing on a physical machine and that Big-O assumes a ‘zero-cost abstraction’ world. For example, the extra overhead of introducing a hashmap (memory allocations/copies) will make it slower for small lists (also no dynamic memory allocations in an ISR). This is why the C++ STL uses hybrid introsort. Quadratic insertion/bubble sort preferable for small lists, loglinear divide-and-conquer merge/quick for medium, linear radix for large. In some cases the space and size parameters differ. Can join linear operations: populate and min/max determination.
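A minimal open-addressing sketch in C of the mask trick above (illustrative only: fixed power-of-two size, linear probing, a made-up mixing hash, key 0 reserved as the empty marker, no handling of a full table):
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 64                       /* must be a power of two for the mask to work */

static uint32_t table[TABLE_SIZE];          /* 0 marks an empty slot, so key 0 is reserved */

static uint32_t hash(uint32_t x)
{
    x ^= x >> 16;                           /* cheap mixing; real code would use a stronger hash */
    x *= 0x45d9f3b;
    return x & (TABLE_SIZE - 1);            /* mask instead of modulo */
}

static void insert(uint32_t key)
{
    uint32_t i = hash(key);
    while (table[i] != 0 && table[i] != key)
        i = (i + 1) & (TABLE_SIZE - 1);     /* linear probing */
    table[i] = key;
}

static int contains(uint32_t key)
{
    uint32_t i = hash(key);
    while (table[i] != 0) {
        if (table[i] == key) return 1;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}

int main(void)
{
    insert(42); insert(7);
    printf("%d %d\n", contains(42), contains(9));   /* 1 0 */
    return 0;
}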
GE (12): * gens * phys
Free Electives (36): * COMP8001 (elective) * COMP3331 (networks) * COMP9032 (microprocessors)
Disciplinary/major (96): 66 (core) + 30 (electives) Change to default COMPA1 (Computer Science) from COMPS1 (Embedded Systems) * COMP1511 (Programming Fundamentals) * COMP1521 (Computer Systems Fundamentals) * COMP1531 (Software Engineering Fundamentals) * COMP2521 (Data Structures and Algorithms) * MATH1081 (Discrete Mathematics) * MATH1131 (Mathematics 1A) * MATH1231 (Mathematics 1B)
144UoC total: – COMP3900 (Computer Science Project)
Sap is a fluid that transports nutrients throughout a tree. Gum trees are named as their sap is gum-like, as opposed to say resin-like. They will also typically have smooth bark. Eucalyptus is a type of gum tree. Eucalyptus oil droplets from the forests combine with water vapour to scatter short-wavelength rays of light that are predominantly blue (ROYGBIV in descending wavelengths)
Australia has 5 time zones: Christmas Island (UTC+7), Perth (UTC+8), Adelaide (UTC+9:30), Canberra (UTC+10), Norfolk Island (UTC+11)
Daylight savings incorporates the literal increase in sunlight to timezone So in Spring, the clock springs forward. We lose an hour, i.e 23 hours in that day
In the Fall, the clock falls back. We gain an hour, i.e 25 hours in that day
When using daylight saving time, will be AEDT as oppose to AEST, i.e. different time zones Adelaide, south-east onwards observe daylight savings WA not observed as large part of state is close to tropics, negating effect of tilt of earth
Gregorian calendar does not correspond exactly to one solar year. So, every 4 years February has an extra day (28 to 29), known as a leap year. Also have leap seconds. Gregorian calendar primarily used, however Chinese calendar has each month start on a new lunar phase
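The full Gregorian rule as a small C helper (the every-4-years note above skips the century exception):
#include <stdio.h>

/* Leap year: divisible by 4, except centuries, unless divisible by 400. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

int main(void)
{
    printf("%d %d %d\n", is_leap_year(2024), is_leap_year(1900), is_leap_year(2000)); /* 1 0 1 */
    return 0;
}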
Solar eclipse when sun is eclipsed by moon Lunar eclipse when moon is eclipsed by earth
Prime meridian is through Greenwich, England. Longitude is vertical lines, indicating east or west from prime meridian. International Date Line is when you cross over 180° longitude, i.e. from UTC+12 to UTC-12 or vice versa
Latitude is horizontal from equator. Places on the equator have equal time of daylight and night time
Tropics are at 23.5°. They are at this amount as this is how much the Earth’s axis is tilted. North line is Cancer, south line is Capricorn. The tropics are the region between the tropic lines. They are hot due to the sunlight they receive all year round. So, the tropics are actually hotter than equatorial regions. The Arctic and Antarctic circles are latitude circles. They represent what areas sunlight will hit. So in summer, they will have 24-hour days, whilst in winter 0-hour days.
Solstices mark the start of summer (longest daylight of year) and winter. Equinox is start of spring and autumn (equal number of daylight and night time)
Timestamp is calendar and time of day and UTC (Coordinated Universal Time) offset
Earth rotates eastward, so sun rises in the east no matter what hemisphere.
CPU contains a clock. Each tick marks a step in the fetch-decode-execute cycle. The signal will be sent along the address bus as specified by program counter and instruction or data will be returned along the data bus.
Von-Neumann has instructions and data share address space. The Von-Neumann bottleneck occurs when having multiple fetches in a single instruction, e.g. ldr Harvard has instructions and data with separate address spaces In reality, all CPUs present themselves as Von-Neumann to the user, however for efficiency they are modified Harvard at the hardware level, i.e. pipeline/cache stage. Specifically, will have separate L1 cache for instructions and data. (also have uOP cache considered L0) Therefore, when an architecture is described as Harvard, almost certainly modified Harvard.
Endianness only relevant when interpreting bytes from a cast. Can really only see the usefulness of big endian for say converting a string to an int. Little endian makes recasting variables of different lengths easier as the starting address does not change
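A small C sketch of the recasting point; the printed values assume a little-endian machine:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t x = 0x11223344;
    uint8_t *bytes = (uint8_t *)&x;   /* recast: same starting address, byte view */
    uint16_t narrow;
    memcpy(&narrow, &x, sizeof narrow); /* read a narrower value from the same address */

    if (bytes[0] == 0x44)             /* little-endian: least significant byte first */
        printf("little-endian, 16-bit view = 0x%04x (the low half)\n", narrow);
    else
        printf("big-endian, 16-bit view = 0x%04x (the high half)\n", narrow);
    return 0;
}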
Although the instruction set supports 64 bits, many CPUs’ address buses don’t support the entire 16 exabytes. As we have no need for 16 exabytes (tera, peta, exa), the physical address size may be 39 bits, and the virtual size 48 bits, to save on unused transistors.
Direct-mapped cache has each memory address mapping to a single cache line. Lookup is instantaneous, however there is a high number of cache misses. Fully-associative cache has each memory address mapping to any cache line. The entire cache has to be searched, however there is a low number of cache misses. Set-associative cache divides the cache into fully-associative blocks (sets); an 8-way cache means 8 cache lines per set. This is best in maintaining a fast lookup speed and a low number of cache misses.
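A rough C calculation of which set an address lands in, assuming an illustrative 32 KiB, 8-way, 64-byte-line cache (assumed parameters, not any particular CPU):
#include <stdint.h>
#include <stdio.h>

enum {
    LINE_SIZE  = 64,
    WAYS       = 8,
    CACHE_SIZE = 32 * 1024,
    NUM_SETS   = CACHE_SIZE / (LINE_SIZE * WAYS)   /* 64 sets */
};

int main(void)
{
    uint64_t addr = 0x7ffdc0de1234;                /* made-up address */
    unsigned set = (unsigned)((addr / LINE_SIZE) % NUM_SETS);
    /* The line may live in any of the 8 ways of this set; a direct-mapped cache
       would instead use % (CACHE_SIZE / LINE_SIZE), giving exactly one candidate line. */
    printf("address 0x%llx -> set %u of %u\n", (unsigned long long)addr, set, (unsigned)NUM_SETS);
    return 0;
}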
Cache sizes will stay relatively small due to the nature of how computers are used. At any point, there is only a small amount of local data the CPU will process next. So, beyond the empirical sweet spot of approximately 64MB, having a larger cache will yield only marginal benefits for cost of SRAM and die-area, i.e. law of diminishing returns.
Check if in L1. If not go check in L2 and mark least recently accessed L1 for move to L2. Bubbles up to L3 until need for memory access which will go to memory controller etc.
Alignment ensures that values don’t straddle cache line boundaries.
CISC gives reduced cache pressure for high-intensity, sustained loops as fewer instructions are required. Instructions will have higher cycle counts. Also typically a register-memory architecture, e.g. can add one value in a register and one in memory together (as opposed to load-store)
TDP (thermal design power) is the maximum amount of heat (measured in watts) at maximum load that is designed to be dissipated (bus sizes of chips smaller, so TDP getting lower). However, the value is rather vague as it could be measured on an overclock and doesn’t take into account ambient conditions
Clock frequency will often be changed by OS scheduler in idle moments or thermal throttling
Hardware scheduler allows for hyperthreading which is the sharing of execution units. Therefore, hyperthreading not a boon in all situations. AMD refers to this as simultaneous multithreading.
Microarchitecture will affect instruction latency and throughput by way of differing execution and control units. For Intel CPUs, i3–i7 of the same generation will have the same microarchitecture, just different cache, hyperthreading, cores, die size etc.
Codec is typically a separate hardware unit that you interact with via a specific API. HEVC (H.265; high efficiency video coding) is a newer version of H.264. VP9 is an open source Google video coding format
When people say vector operations, they mean SIMD. SSE registers (XMM) are 128 bits (4 floats); AVX registers (YMM) are 256 bits (8 floats)
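A minimal SSE sketch in C using compiler intrinsics (SSE is baseline on x86-64, so this should build with a stock GCC/Clang):
#include <stdio.h>
#include <xmmintrin.h>

int main(void)
{
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);    /* one 128-bit XMM register = 4 floats */
    __m128 b = _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);
    float out[4];
    _mm_storeu_ps(out, _mm_add_ps(a, b));             /* one instruction adds all 4 lanes */
    printf("%.0f %.0f %.0f %.0f\n", out[0], out[1], out[2], out[3]); /* 11 22 33 44 */
    return 0;
}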
Average CPU die-size is 100mm². GPU much larger at 500mm² as derives more benefits from more control units, i.e. parallelisation Common transistor size is 7nm. Low as 2nm Silicon atom is 0.2nm Gleaning a Moore’s law transistor graph, see that average CPU a few billion transistors and high end SOCs around 50 billion transistors.
Memory model outlines the rules regarding the visibility of changes to data stored in memory, i.e. rules relating to memory reads and writes A hardware memory model relates to the state of affairs as the processor executes machine code: * Sequential Consistency: Doesn’t allow instruction reordering, so doesn’t maximise hardware speed * x86-TSO (Total Store Order): All processors agree upon the order in which their write queues are written to memory However, when the write queue is flushed is up to the CPU * ARM (most relaxed/weak) Writes propagate to other processors independently, i.e not all update at same time Furthermore, the order of the writes can be reordered Processors are allowed to delay reads until writes later in the instruction stream
A software memory model, e.g a language memory model like C++ will abstract over the specific hardware memory model it’s implemented on. It will provide synchronisation semantics, e.g. atomics, acquire, release, fence etc. These semantics are used to enforce sequentially consistent behaviour when we want it. However, using intrinsics, we can focus only on hardware memory model.
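A hedged C11 sketch of acquire/release publication using <stdatomic.h> and the optional <threads.h> (not every libc ships C11 threads); the variable names are made up:
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <threads.h>

static int payload;                 /* plain data, published via the flag */
static atomic_bool ready = false;

static int producer(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, true, memory_order_release); /* writes above cannot sink below */
    return 0;
}

static int consumer(void *arg)
{
    (void)arg;
    while (!atomic_load_explicit(&ready, memory_order_acquire)) /* reads below cannot hoist above */
        ;
    printf("%d\n", payload);        /* guaranteed to print 42 */
    return 0;
}

int main(void)
{
    thrd_t p, c;
    thrd_create(&p, producer, NULL);
    thrd_create(&c, consumer, NULL);
    thrd_join(p, NULL);
    thrd_join(c, NULL);
    return 0;
}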
A cache controller implements cache coherency by recording states for each cache line. MESI is the baseline used. Has states: * Modified: Only in this cache and dirty from main memory * Exclusive: Only in this cache and clean from main memory * Shared: Clean and shared amongst other caches * Invalid Intel uses MESIF (Forward is the same as Shared except a designated responder), while Arm uses MOESI (Owned is modified but possibly in other caches) Cache coherency performance issues are difficult to debug, e.g. one value changed in a cache line invalidates it, even though another value in the cache line remains unchanged
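A small C sketch of that last point: aligning per-thread counters to their own cache line (assuming 64-byte lines) so independent writes don't invalidate each other's line:
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

/* Each counter is forced onto its own 64-byte line (assumed line size). Without the
   alignment, counters[0] and counters[1] would share a line, and every write from one
   core would invalidate the other core's copy (MESI Modified/Invalid ping-pong). */
struct counter {
    alignas(64) uint64_t value;
};

static struct counter counters[2];

int main(void)
{
    printf("sizeof(struct counter) = %zu\n", sizeof(struct counter)); /* 64, not 8 */
    printf("%p %p\n", (void *)&counters[0], (void *)&counters[1]);    /* 64 bytes apart */
    return 0;
}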
Anti-trust laws don’t prevent monopolies, they prevent attempts to monopolise by unfair means, e.g. Microsoft browser market, Apple app store etc.
Technically, any digital work created is automatically protected by copyright. So, without a license, people would have to explicitly ask for permission to use
Permissive (MIT, BSD, Apache, zlib) gives users more freedom to say relicense, include closed source software, etc. They generally just enforce attribution Apache like MIT except must state what files you have changed
Weak copyleft (LGPL) applies to the files of the library, not your entire codebase, i.e. you must still release your version of the library used. So, dynamic linking makes this easier for keeping your source closed. If statically linking, must take a few extra steps to ensure the LGPL parts are available, e.g. publish object files
Copyleft (GPL) enforces the developer’s usage of the code. So, any derivative software must release the whole project as GPL, i.e. it infects your software (and restricts choice of libraries to GPL-compatible ones). Subsequently encounter more licensing restrictions.
Creative Commons licenses are composed of various attributes. The default is attribution. They are typically used in artworks, e.g. images, audio files. Other elements are optional and can be combined together, e.g. no derivatives, non-commercial, must share under same license.
Public domain means no license, so could claim as yours
UUID/GUID (universally/globally unique identifier) is 16 bytes. Version 1 is typically generated by concatenating bits of the MAC address and a timestamp
UEFI firmware interface made to standardise interface between OS and firmware for purposes of booting
Called /dev/sd as originally for SCSI (small computer system interface; standards for transferring data between computers and devices) The preceding letter indicates the order in which it’s found, e.g. /dev/sda first found The preceding number indicates the partition number, e.g. and /dev/sda1, /dev/sda2
UEFI use of GPT (GUID partition table) incorporates CRCs to create a more recoverable boot environment over the BIOS MBR (located in the first sector of the disk). Furthermore, UEFI has more addressable memory in 64-bit mode as opposed to only 16-bit mode. Also, UEFI supports networking. The ESP (EFI system partition) will have EFI boot entries that point to a UUID of where to boot; one of these will be a GRUB/shim binary like shimx64.efi. NOTE: The bootloader is the EFI OS loader and is part of the OS that will load the kernel
ACPI interface to pass information from firmware to OS. This firmware will have hardware information baked into it set by manufacturers
Inodes store file metadata. The metadata stored by an inode is determined by filesystem in use, except for filename which is never stored Typical metadata includes size, permissions, data pointer NOTE: FAT32 won’t store permissions, last modification time, no journaling or soft-links
A filename maps to an inode. Therefore a directory is a mapping of filenames to inodes
A hardlink references an inode, and is therefore impervious to file name change, deletion, etc. A softlink is to a file name
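A POSIX C sketch of the difference (the file names original.txt/hard.txt/soft.txt are made up; run in an empty directory):
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    FILE *f = fopen("original.txt", "w");
    if (!f) return 1;
    fputs("hello\n", f);
    fclose(f);

    link("original.txt", "hard.txt");     /* new filename referencing the same inode */
    symlink("original.txt", "soft.txt");  /* new inode whose data is the target's name */

    struct stat a, b, c;
    stat("original.txt", &a);
    stat("hard.txt", &b);
    lstat("soft.txt", &c);                /* lstat inspects the link itself */
    printf("inodes: original=%lu hard=%lu soft=%lu\n",
           (unsigned long)a.st_ino, (unsigned long)b.st_ino, (unsigned long)c.st_ino);

    /* Deleting or renaming original.txt leaves hard.txt intact but dangles soft.txt. */
    return 0;
}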
Journaling is the process of regularly writing operations that are to be performed (held in RAM) to a disk area known as the journal, then applying these changes to the disk when necessary. This overhead makes journaling filesystems slower, but more robust on crashes, as the journal can be read to ascertain whether certain operations finished performing
FAT for ESP (because FAT simple, open source, low-overhead and supported virtually everywhere) vFAT is driver (typically for FAT32)
EXT4 for system (supports larger file sizes) (NTFS microsoft proprietary) Most filesystems will use a self-balancing tree to index files
Ubuntu distro as compared to debian more user friendly. For example, automatically includes proprietary drivers like WiFi, has PPAs to allow installation of 3rd party applications, and install procedure just works. Also updates more regularly than Debian and provides LTS, so know regular backports provided
Linux is a monolithic kernel, i.e. drivers, file system etc. are all in kernel space. So, more efficient, not as robust to component failure Windows uses a hybrid kernel, moving away from original microkernel due to inefficiencies Using Ubuntu generic kernel (could also have -lowlatency etc.) to not include a lot of modules in kernel to free up RAM usage
Xfce as default Ubuntu GNOME had bug with multiple keyboards. Furthermore, Xfce codebase was readable when inspecting X11 code. In addition, Xfce automatically provided GUI shortcut creation
.deb and .rpm are binary packages. Annoyances arise due to specifying the specific library dependency for each distro version. Flatpaks and Snaps are containerised applications that include the specific libraries and runtimes. AppImages combine the ‘shared’ libraries and runtimes of Flatpaks and Snaps into a giant file. This file can be copied and run on any distro. If a package is being actively maintained, preferable to use .deb as it’s faster and simpler
Linux DRM (direct rendering manager) -> X11 (display server) -> xfce (desktop environment)
Linux ALSA (advanced linux sound architecture) -> pulseaudio (sound server)
Fstab (File System Table) describes filesystems that should be mounted, when they should be mounted and with what options.
SystemV ABI: rdi, rsi, rdx, rcx, r8, r9 (6 integer arguments); xmm0–xmm7 (8 floating-point arguments); remaining arguments pushed right-to-left on the stack; rax holds the return value and syscall number. Stack 16-byte aligned before a function call. SSE2 is baseline for x86-64, so make efficient for __m128
A preemptive scheduler will swap processes based on specific criteria. Round-robin means each process will run for a designated time slice. CFS is a preemptive round-robin scheduler. Time slices are dynamic, computed like ((1/N) * (niceness)). Processes are managed using an RB-tree; therefore the cost of launching a process or a context switch is logarithmic. The kernel will have an internal tick rate that updates waiting threads. Lowering this will increase granularity, however it will increase CPU time and hence reduce battery life as more time is spent in kernel code. The Windows scheduler uses static priorities, so one intensive process can dominate the CPU.
System level refers to in between kernel and userspace, e.g. network manager. Systemd is a collection of system binaries, e.g. udev. Primarily, systemd is a service manager. A service extends the functionality of daemons, e.g. only start after another service, restart on failure after 10s etc. The kernel will launch the systemd init service that will then bootstrap into userspace (hence allowing for the aforementioned service features)
Kernel offers various methods of process isolation, e.g. chroot, cgroups etc. (chroot cannot access files outside its designated tree). A container will utilise one of these options provided by the kernel to achieve: * cannot send signals to processes outside the container * has own networking namespace * resource usage limits
ELF (Executable and Linkable Format) Header contains type, e.g. executable, dynamic, relocatable and entrypoint A section is compile time: * .text (code) * .bss (uninitialised globals) * .data (globals) * .rodata (constants) A segment is a memory location, e.g .dseg (data), .eseg (eeprom) and .cseg (code/program)
Typically stored as crt0.s, this will perform OS specific run-time initialisation. The conditions assumed here will be outlined in ABI, e.g. argc in rdi Some functions include setting up processor registers (e.g. SSE), MMU, caches, threading ABI stack alignment, frame pointer initialisation, C runtime setup (setting .bss to 0), C library setup (e.g. stdout, math precision) The program loader will load from Flash into RAM then call _start (which is crt entrypoint)
Storage device sizes are advertised with S.I units, whilst OS works with binary so will show smaller than advertised (1000 * 10³ < 1024 * 2¹⁰) Also, storage device write speeds are sustained speeds. So, for small file sizes expect a lot less
A flip-flop is a circuit that can have two states and can store state. Various types of flip-flops, e.g. clock triggered, data only etc. A latch is a certain type of flip-flop. Called this as output is latched onto state until another change in input.
Registers and SRAM stored as flip-flops. DRAM is a single transistor and capacitor
SRAM (static) is fast, requires 6 transistors for each bit. So, 3.2billion transistors for 64MB cache. Sizeable percentage of die-area SRAM more expensive, faster as not periodically refreshed.
DRAM (dynamic) is 1 transistor per bit, refreshed periodically. SDRAM synchronises the internal clock and bus clock speed. LPDDR4 (low-power; double pumping on rising and falling edge of clock, increasing bus clock speed while the internal clock typically stays the same, amount prefetched etc.)
DIMM (Dual In-Line Memory Module) is a form factor with a wider bus. SODIMM (Small Outline)
CAS (Column Address Strobe), or CL (CAS latency) is time between RAM controller read and when data is available. RAM frequency gives maximum throughput, however CL affects this also. In addition, RAM access is after cache miss, so direct RAM latency is only a percentage of total latency as time taken to traverse cache and copy to it.
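A rough worked example with illustrative numbers (not from any datasheet): DDR4-3200 transfers at 3200 MT/s on a 1600 MHz clock, so one clock is 1/1600 MHz = 0.625 ns; at CL16 the CAS latency is 16 × 0.625 ns = 10 ns, about the same absolute latency as DDR4-2400 CL12 (12 × 0.833 ns ≈ 10 ns) despite the lower CL number.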
NAND and NOR flash are two types of non-volatile memory. NOR has faster reads, however is more expensive per bit and has slower write/erase. NOR used in BIOS chips (firmware will be from the motherboard manufacturer, e.g. Lenovo). A NAND mass storage device will require a controller chip, i.e. a microcontroller. How the controller accesses the NAND flash, i.e. the protocol under which its Flash Translation Layer operates, will determine what type of storage it is: * SD (secure digital) * eMMC (embedded multimedia card): Typically SD soldered on motherboard (For SD/MMC protocol, will have an RCA, i.e. Relative Card Address for selecting card) * USB (universal serial bus) * SSD (solid state drive): Parallel NAND access, more intelligent wear leveling and block sparing 3D VNAND (Vertical) memory increases memory density by vertically stacking NAND flash
Form factors include M.2 keying and PCIe (Peripheral Component Interconnect) Interface includes SATAIII, NVMe (non-volatile memory host controller) and PCIe SATA (Serial Advanced Technology Attachment) SSD is the lowest grade SSD. A single form factor may support multiple interfaces, so ensure motherboard has appropriate chipset
Each CPU socket has memory banks that are local to it, i.e. can be accessed from it directly. NUMA (non-uniform memory access) means that accessing memory from a non-local bank will not be the same speed. A NUMA-aware OS will try to mitigate these accesses.
RAID (redundant array of independent disks) is method of combining multiple disks together so as to appear like one disk called an array. Various types, e.g. RAID0 (striping) some parts of file in multiple disks, RAID1 (mirroring) each disk is duplicate so could give speed increase etc.
Battery will have two electrodes, say lithium cobalt oxide and graphite. When going through a charging/discharging cycle, ions move between electrodes. So, charging cycles will affect the atomic structure of the electrodes and hence affect battery life.
Circuits based on conventional current, i.e. + to - Cathode is terminal from which conventional current flows out of, i.e. negative
LiPo (lithium-ion polymer) uses polymer electrolyte instead of liquid. Standard lithium-ion has higher energy density, cheaper, not available in small sizes and more dangerous due to liquid electrolyte LiPo more expensive, shorter lifespan, less energy, more robust LiPo battery is structured to allow a current to be passed to it to reverse the process of oxidation (loss of electrons), i.e. is rechargeable
Battery 51 Wh (watt-hours), which is Ah × V. Capacity is not a fixed value, e.g. 1 Ah could run 0.1 A for 10 hours
Petrol cars still use lead-acid as they have lower internal resistance and so can give a higher peak current than an equivalent LiPo (just not for as long)
HDMI(High Definition Multimedia Interface)-A, C (mini), D (micro) carry audio and visual data DisplayPort has superior bandwidth to HDMI
USB-A, USB-B, USB-B (mini). USB-C is the newer reversible connector, commonly carrying USB 3.x
3.5mm audio jack (3 pole number of shafts (internal wires), 4 pole for added microphone)
Ethernet CAT backwards compatible.
Telephone cable called RJ11
IEC (International Electrotechnical Commission) power cords used for connecting power supplies up to 250V, e.g. kettleplug, cloverleaf
DC barrel jack
Touch screen types need some external input to complete a circuit. Resistive works by pressure pushing down plastic <- electric coating -> glass; unresponsive, durable, cheap. Capacitive contains a grid of nodes that store some charge. When our finger touches, charge flows through us and back to the phone, changing the electric current read. We are good conductors due to the impure water ions in us. So, things electrically similar to our fingers will also work, like sausages, banana peels
1080i/p: 1080 references vertical height in pixels. Interlaced means even and odd rows are displayed on alternating fields; due to modern high bandwidth, not used anymore. Progressive will display each row sequentially for a given frame
4k means horizontal resolution of approximately 4000 pixels. standard different for say television and projection industry, e.g. 3840 pixels
Screen density is a ratio between screen size and resolution measured in PPI (Pixels Per Inch)
A voltage is applied to an ionised gas, turning it into superheated matter that is plasma; UV is subsequently released (the basis of plasma displays).
LCD (Liquid Crystal Display) involves a backlight shining through crystals. IPS (In-Plane Switching) and TFT (Thin Film Transistor) are example crystal technologies. For an LED monitor, the LED is the backlight, as opposed to a fluorescent tube. However it still uses an LCD panel, so really LED LCD.
Quantum science deals with quanta, i.e. the smallest unit that comprises something. They behave strangely and don’t have well defined values for common properties like position, energy etc., e.g. the uncertainty principle. A ‘quantum dot’ is a semiconductor nanoparticle that has different properties to larger particles as a result of quantum mechanics. QLED/QNED (Quantum NanoCell) adds a ‘quantum dot’ layer into the white LED backlight LCD sandwich.
OLED is distinct. It produces own light, i.e. current passed through an OLED diode to produce light. LTPO (Low Temperature Polycrystalline Oxide) is a backplane for OLED technology.
E-ink display uses less power than LCD as it only uses power when the arrangement of colours changes.
HDR (High Dynamic Range) and XDR (Extreme Dynamic Range) increase ability to show contrasting colours.
5.1 means 5 speakers, 1 subwoofer. In ascending order of audible frequencies (20Hz–20kHz) have devices: subwoofer, woofer, speaker (midrange) and tweeter.
ARM core ISA, e.g. ARMv8. Will then have profiles, e.g. M-profile (ARMv8-M); R-profile for larger real-time systems like automotive
ARM Holdings implements a profile in its own CPU, e.g. Cortex-A72. This is a synthesisable IP (Intellectual Property) core sold to other semiconductor companies, who make implementation decisions like the amount of cache from the 8–64KB specification.
However, other companies can build own CPU from ISA alone, e.g. Qualcomm Kryo and Nvidia Denver
Then have actual MCUs e.g. STMicroelectronics, NXP or SoCs e.g. Qualcomm Snapdragon, Nvidia Tegra
big.LITTLE is a heterogeneous processing architecture with two types of processors: big cores are designed for maximum compute performance and LITTLE for maximum power efficiency
FFT divides samples (typically from an ADC) into frequency bands. A logarithmic scale is typically employed to account for nonlinear values, whereby a greater proportion fall in the high-frequency bands
DSP instructions may include transforms like FFT, filters like IIR/FIR (Finite Impulse Response; no feedback) and statistical like moving average
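A minimal moving-average FIR sketch in C (4 taps, equal 1/N coefficients, purely illustrative; the fake ADC samples are made up):
#include <stdio.h>

#define TAPS 4

static float fir_moving_average(const float *history)  /* history[0] is the newest sample */
{
    float acc = 0.0f;
    for (int i = 0; i < TAPS; i++)
        acc += history[i] * (1.0f / TAPS);              /* all coefficients equal: 1/N, no feedback */
    return acc;
}

int main(void)
{
    float samples[] = {0, 0, 0, 4, 4, 4, 4, 4};         /* pretend ADC readings: a step input */
    float history[TAPS] = {0};

    for (int n = 0; n < 8; n++) {
        for (int i = TAPS - 1; i > 0; i--)              /* shift the delay line */
            history[i] = history[i - 1];
        history[0] = samples[n];
        printf("%.2f ", fir_moving_average(history));   /* output ramps from 0 to 4 over 4 samples */
    }
    printf("\n");
    return 0;
}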
With the addition of the 64-bit extension (aarch64), the original 32-bit ARM state was retroactively called aarch32. Possible instruction sets include Thumb-1 (16-bit), Thumb-2 (16/32-bit), aarch32 (32-bit instructions), aarch64 (32-bit wide instructions), Neon, MTE (Memory Tagging Extension), etc. aarch64 does not allow Thumb instructions
FPU is a VFPv3-D16 implementation of the ARMv7 floating-point architecture VFP (Vector Floating Point) is floating point extension on ARM architecture. Called vector as initially introduced floating point and vector floating point. Neon is product name for ASE (Advanced SIMD Extension), i.e. SIMD for cortex-A and cortex-R (more recent is SVE (Scalable Vector Extension)) Helium is product name for MVE (M-profile Vector Extension), i.e. SIMD for cortex-M
RSA (Rivest-Shamir-Adleman) is asymmetric, i.e. public and private key. Much slower than AES AES (Advanced Encryption Standard) is symmetric, i.e. one key SHA (Secure Hash Algorithm) is one-way and produces a digest
MPU (Memory Protection Unit) only provide memory protection not virtual memory like an MMU (Memory Management Unit)
Built atop the Android OS, many phones will implement their own custom UI, e.g. Huawei EMUI, Samsung One UI. The ART (Android Runtime) is the Java virtual machine that performs JIT bytecode compilation of an APK (Android Package Kit)
EABI (Embedded ABI) is the newer ARM ABI, named as it suits the needs of embedded applications. An EABI will omit certain abstractions present in an ABI designed for a kernel, e.g. running in privileged mode. From the calling convention part of the ABI we can garner everything from the number of register arguments before stack usage, to alignment requirements.
AAPCS (Arm Architecture Procedure Call Standard): r0-r3, rest stack s0-s7 Stack 4-byte aligned, if on function call, 8-byte aligned
AAPCS64: x0-x7 (w0-w7 for 32-bit), rest stack v0-v7 Stack 16-byte aligned
Most modern ARM architectures will not crash on unaligned accesses
Shader is a GPU program that is run at a particular stage in the rendering pipeline. Nvidia GPU cores are named CUDA cores; AMD calls them stream processors; ARM has shader cores. So, CUDA is a general purpose Nvidia GPU program that can utilise the GPU’s highly parallelised architecture. OpenCL, whilst more supportive, i.e. can run on CPU or GPU, does not yield the same performance benefits. RenderScript is Android specific, heterogeneous in that it will distribute load automatically. OpenGL has a lot of fixed-function legacy (now shader based) and drivers rarely follow the standard in its entirety. OpenGL ES (Embedded Systems) is a subset. Vulkan is low-level and more closely reflects how modern GPUs work
Flux is an arbitrary term used to describe the flow of things, e.g. photon flux, magnetic flux
Lumens is how much total light is produced by an emitter. Candela is the intensity of a light beam produced by an emitter. Lux is how much light hits a receiving surface. Nit is how much light is reflected off a surface and so is what our eyes and cameras pick up. A display will be in nits as it’s a receiving object, as opposed to the backlight LEDs. A higher nit display is more easily viewable in a wider array of lighting conditions, e.g. will combat the sun’s light reflecting off the surface in an outdoor setting. Brightness is subjective, and therefore does not have a value associated with it
Sound waves (20Hz-20kHz): ultrasonic waves are sound waves not audible by humans. SONAR (Sound Navigation And Ranging) is used in maritime settings as radio waves are largely absorbed in seawater due to its conductivity
Radio waves (10Hz-300GHz): microwaves make up the majority of the radio spectrum (300MHz-300GHz). They are divided into bands, e.g. C-band, L-band, etc. RADAR (Radio Detection And Ranging) encompasses the microwave spectrum. Higher frequency gives higher resolution than SONAR; also, being an EM wave, a much faster transmission rate. Allowing for long-range transmission, radio waves bounce off the ionosphere (where Earth's atmosphere meets space)
Infrared waves (300GHz-300THz) can be used in object detection. Heat is the motion of atoms: the faster they move, the more heat is produced. Approximately 50% of solar radiation is infrared.
Visible light: LIDAR (Light Detection And Ranging) (lasers; weather dependent) gives higher accuracy and resolution than radar, but lower range
Ultraviolet: UVA has a longer wavelength and is associated with skin ageing, UVB with skin burning. UVB doesn't pass through glass, however UVA does. UVC is a germicide. SPF (Sun Protection Factor) is how many times longer it takes to burn than with no sunscreen. However, UV can still get through, and sunscreen is water-resistant, not waterproof
Ionising X-Rays
Ionising Gamma-Rays Sterilisation and radiotherapy
ISM (Industrial, Scientific and Medical) bands (900MHz, 2.4GHz, 5GHz) occupy unlicensed RF spectrum. They include WiFi and Bluetooth but exclude telecommunication frequencies
GNSS (Global Navigation Satellite Systems) contain constellations: GPS (US), GLONASS (Russia), Galileo (EU) and BeiDou (China). They all provide location services, however implement different frequencies, etc.
GSM (Global System for Mobile Communications) uses SIM (Subscriber Identification Module) cards to authenticate (identity) and authorise (privilege) access
4G (Generation; 1800MHz) outlines min/max upload/download rates and associated frequencies. Many cell towers cannot fully support the bandwidth capabilities outlined by 4G. As a result, the term 4G LTE (Long Term Evolution) is used to indicate that some of the 4G spec is implemented. More specifically there is 4G LTE cat 13, etc. to indicate the particular features implemented.
SMS (Short Message Service) are stored as clear text by provider SS7 (Signaling System Number 7) protocol connects various phone networks across the world
Between protocols there are tradeoffs between power and data rate. IEEE (Institute of Electrical and Electronic Engineers): * 802.11 group for WLANs (WiFi - high data rate), * 802.15 for WPANs; 802.15.1 (Bluetooth), * 802.15.4 low data rate (ZigBee). LPWANs like LoRa and Sigfox sit outside the IEEE 802 standards
WiFi, Bluetooth, ZigBee are for local networks. LoRa is like a low-bandwidth GSM. LoRa (Long Range) has low power requirements and long distance; AES-128 encrypted by default. LoRa is useful if only sending some data a few times a day. LoRa has configurable bandwidth, so can go up to 500kHz if regulations permit. Lower frequency yields longer range as the longer wavelength won't be reflected off objects; will be called narrowband. Doesn't require IP addresses. LoRaWAN allows large star networks to exist in say a city, but will require at least one IP address for a gateway. Sigfox uses more power.
A BLE (Bluetooth Low Energy) transceiver is only on if being read or written to. GATT (Generic Attribute Profile) is a database that contains keys for particular services and characteristics (the actual data). When communicating with a BLE device, we are querying a particular characteristic of a service
A QR (Quick Response) code is a 2D barcode with more bandwidth. Uses a laser reader. RFID (Radio Frequency Identification) does not require line-of-sight and can read multiple objects at once. Uses RFID tag. NFC (Near Field Communication) is for low-power data transfer. Uses NFC tag
TV standards: Americas use NTSC (30fps, fewer scanlines per frame) 4.4MHz; Europe, Asia, Australia use PAL (Phase Alternating Line) (25fps) 2.5MHz
MEMS (Micro Electro Mechanical Systems) combines mechanical parts with electronics like some IC, i.e. circuitry with moving parts. e.g. microphone (sound waves cause diaphragm to move and cause induction), accelerometer, gyroscope (originally mechanical)
On phone, many sensors implemented as non-wakeup. This means the phone can be in a suspended state and the sensors don’t wake the CPU up to report data
Accelerometer measures rate of change in velocity, i.e. vibrations associated with movement (m/s²) So can check changes in orientation. It will have a housing that is fixed to the surface and a mass that can move about. Detecting the amount of movement in the mass, can determine acceleration in that plane.
A gyroscope measures rotational rate (rad/s), unlike an accelerometer, which is unable to distinguish rotational from linear acceleration. A gyroscope resists changes to its orientation due to the inertial forces of a vibrating mass. So it can detect angular momentum, which can be useful for guidance correction.
A gimbal is a pivoted support that permits rotation about an axis
IMU (Inertial Measurement Unit) is an accelerometer + gyroscope + magnetometer (teslas) The magnetometer is used to correct gyroscope drift as it can provide a point of reference
Quartz is piezoelectric, meaning mechanical stress results in electric charge and vice versa. In an atomic clock, Caesium atoms are used to control the supply of voltage across quartz. This is done, in order to keep it oscillating at the same frequency.
NTP (Network Time Protocol) is a TCP/IP-suite protocol (carried over UDP) for clock synchronisation. It works by comparing with atomic clock servers
IEEE 754 single-precision float layout: (sign 1 bit)-(exponent 8 bits)-(significand/mantissa 23 bits), i.e. value = (-1)^sign * 2^(exponent - 127) * 1.mantissa
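As a minimal sketch (assuming IEEE 754 binary32 and using memcpy for a well-defined type pun), the fields can be pulled apart like so:
#include <stdint.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
  float f = -6.25f;
  uint32_t bits;
  memcpy(&bits, &f, sizeof(bits));         /* well-defined type pun */
  uint32_t sign     = bits >> 31;          /* 1 bit */
  uint32_t exponent = (bits >> 23) & 0xFF; /* 8 bits, biased by 127 */
  uint32_t mantissa = bits & 0x7FFFFF;     /* 23 bits, implicit leading 1 */
  /* value = (-1)^sign * 2^(exponent - 127) * 1.mantissa */
  printf("sign=%u exponent=%u mantissa=0x%06X\n",
         (unsigned)sign, (unsigned)exponent, (unsigned)mantissa);
  return 0;
}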
PIC (Position Independent Code) can be executed anywhere in memory by using relative addresses, e.g. shared libraries. The process of converting relative to absolute, i.e. resolving unresolved symbols at runtime, adds a level of indirection
Data deduplication means to remove duplicates
TODO: testing procedures
REST (representational state transfer) is an interface style that outlines how the state of a resource is interacted with. On the web, a URL is an access point to a resource. A RESTful API will have URLs respond to CRUD requests in a standard way: * GET example.com/users returns the list of resources, i.e. all users * POST example.com/users creates a new resource * GET example.com/users/1 returns a single resource * PUT example.com/users/1 updates a single resource * DELETE example.com/users/1 deletes a single resource
OAuth (open authorisation) is a standard that defines a way of authorising access OAuth offers different functionality than SSH by having the ability to ‘scope’ access Typically used by RESTful services These endpoints described in ‘discovery document’: 1. authorisation-server -> authorisation-code 2. authorisation-code -> access token, refresh token 3. resource-server -> resource
RFC (Request For Comments) documents contain technical specifications for Internet technologies, e.g. IP, UDP, etc.
UDP (no head-of-line blocking) + client-server (P2P unreliable as internet paths are optimised for cost/closest exchange point) + dedicated servers (people's home internet doesn't normally have high upload rates); mix of cloud (flexible to just turn up and down, high egress bandwidth charge) and bare metal (fixed bandwidth rate set into price). Matchmaking and host migration are difficult as it's hard to measure which user has a good connection, e.g. what's their NAT type?
MOSFET type of transistor that is voltage controlled
CMOS technology allows the creation of low standby-power devices, e.g. non-volatile CMOS static RAM
EMV (Europay, Mastercard, Visa) chips implement NFC for payments
Various synthetic benchmarks are indicative of performance, e.g. DMIPS (Dhrystone Million Instructions per Second) for integer and WMIPS (Whetstone) for floating point
SPDIF (Sony Philips Digital Interface) carries digital audio over a relatively short distance without having to convert to analog, thereby preserving audio quality.
The polarity of the magnetic field created by power and ground wires will be opposite. So, having the same position in each wire line up will reduce outgoing noise as superposition of their inverse magnetic fields will cancel out. Furthermore, incoming noise will affect each wire similarly Coaxial has the two conductors share an axis with shielding outside. Twisted pair wire is a cheaper way of implementing coaxial Glass fibre optic does not have this issue.
ASIC (Application Specific Integrated Circuit) MCU for specific task
On startup, copy from flash to RAM then jump to the reset handler address. No real need for newlib, just use the standalone mpaland/printf. Some chips have XIP (execute-in-place) which allows running directly from flash
Qi is a wireless charging standard widely supported by mobile devices for distances up to 4cm. FreePower technology allows Qi charging mats to support concurrent device charging
QSPI can be used without CPU with data queues
Chrom-ART Accelerator offers DMA for graphics, i.e. fast copy, pixel conversion, blending etc.
The LED anode is the positive, longer lead
5ATM is 5 atmospheres. 1 atmosphere is about 10m of water (however rated when motionless), e.g. 50m for 10 minutes
MIDI (Musical Instrument Digital Interface) uses 3-byte messages that describe the note type, how hard it was pressed and what channel; useful for sending out of an MCU. FRAM (ferroelectric RAM) is non-volatile and gives the same access properties as RAM
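A minimal sketch of packing a note-on message (the status byte carries the message type in the high nibble and the channel in the low nibble, followed by note number and velocity); uart_send_byte() is a hypothetical transmit routine:
#include <stdint.h>
void uart_send_byte(uint8_t byte); /* hypothetical MCU UART transmit routine */
void midi_note_on(uint8_t channel, uint8_t note, uint8_t velocity)
{
  uart_send_byte(0x90 | (channel & 0x0F)); /* status: note-on, on the given channel */
  uart_send_byte(note & 0x7F);             /* which key */
  uart_send_byte(velocity & 0x7F);         /* how hard it was pressed */
}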
RENDERING: rendering is the process of creating the 2D/3D image, i.e. the drawing onto the monitor. Ray tracing solves transparency issues; it's just substantially slower than standard projection rasterisation (so the future is ray tracing)
3D works by emulating a simplified model of how a single human eye views space
A single point of light from the sun hitting a single object will reflect in multiple straight line directions/paths
rasterisation is taking a polygon and converting it to pixels; eventually we have to convert say a 3D position to 2D (as all rendering is fundamentally this)
lens refract light to central focal point
/dev/urandom is a cryptographically secure pseudo-random number source; /dev/zero supplies zero bytes
creating website: https://threadreaderapp.com/thread/1606219302855745538.html
TODO: what is blockchain and web3?
I decided to try and implement ctime in Bash for pedagogical purposes. My first task was to write and read a binary file. Googling how to do this in Bash returned the consensus, "use another language". Often when I read this, I'm not deterred; I have encountered similar naysayers before when it comes to directly using Xlib. However, when it came to wanting structures to read and write to (essential for binary work), I found Bash was empty. With this understanding, I realised more generally that the usefulness of scripting languages is limited. Specifically, they should be limited to basic tasks that involve searching or external tool usage. I suppose my biggest use case of Bash is for enhancing my terminal (.bashrc) and vim (:.!) interactions. C is a simple language and if you understand its low-level capabilities, you can do many things. However, despite this recognition, I did come upon certain procedures to follow when working with Bash scripts.
# Checking arguments as callee
[ ! $# -ge 3 ] && return 1
[ "$1" = "-arg" ]
# Checking arguments as caller
func || exit 1
# Function returning value
_FUNC_NAME=val
# Informative usage information
printf "App v1.0 by Ryan McClue\n" >&2
printf "Usage:\n" >&2
printf "app -arg <value>\n" >&2
The diverse software landscape of Linux distros means that we don't know what version of each library or tool a user has installed
STM32CubeMX (this just generates code; the IDE is full-fledged) must be downloaded via an emailed link. STM32CubeIDE is woefully slow; maximising to full screen just blurs out half. Once installed, a series of pop-up menus keeps appearing spontaneously as it has to download more to satisfy a simple project creation. To download qtcreator there is a well-known bug where it selects the wrong mirror, 3MBps down to 30kbps. Have to decipher command line arguments and mirror parsing, e.g. ./qt-unified-linux-x64-4.3.0-1-online.run --mirror http://ftp.jaist.ac.jp/pub/qtproject
My god, installing CubeMX is awful: requires an email, and password setting for the account fails.
QtCreator does not honour the system .gdbinit file; have to manually set breakpoints and disassembly flavour. QtCreator doesn't show console output, and ticking 'run in terminal' crashes on startup
Before even using the program, installing is bad. Install pandoc: it requires a PDF engine, e.g. xelatex, that must be separately installed, whose package name is not found with a traditional sudo apt install xelatex; it's in texlive-xetex and is 100MB… Searching for the appropriate package name to install a missing dependency is what I use Ubuntu Stack Exchange mostly for…..
Firefox just forces restart
gpus have introduced a whole host of undocumented NDA annoyances
sound output varies significantly across different hardware. so, to detect possible sound bugs we may need to invest in a good piece of sound hardware
in linux, x11 plays the role of a mixer for drm, and pulse is a mixer for alsa (more flexibility with pulse, e.g. can disable it if only needing one application at a time, or can redirect it to another sound card). audio is a mess in linux
due to the disconnected nature of x11 and scheduling, must vsync. sleeping isn't an option due to the default round-robin scheduling policy (switching to real-time scheduling is not suitable for a general OS). we could improve by setting niceness (however must be sudo). as we're not real time, we can't control a lot of things, e.g. could be USB latency, adobe photoshop in the background, etc. on say a raspberry pi we could program GPIO pins to have an accurate poll for a joystick.
sound on linux is absurd (tribal knowledge). even the people on the mailing list don't know how it works, constantly saying, don't do this, use a library… often times, source code is the documentation, so good luck spending months learning how to operate at a low level. return when threading is involved.
x11 is more complicated than it needs to be. like many OS APIs, it should do a lot of the dirty work for you, as no-one knows all the minutiae. it's bad to think that OSs are getting harder to work with. however, it is unfair to pick on X11, as there are other APIs (even modern ones) that are just as bad, or even worse. x11 is particularly annoying in that if it crashes, even launching a virtual terminal may not work as keys aren't registered
if only the eclipse cdt debugger was good. it terminates for no reason when running (can sometimes be fixed by switching to run mode and then back to debug). sometimes have to restart to fix. run-to-line doesn't work when inside a sub-routine. the memory browser crashes on entering an address. doesn't give information on stack overflows.
firefox will unexpectedly say we updated in the background and you must restart, or a tab crashed, etc.
Online help forums for QTCreator useless. In reality, only core devs would be able to answer your question thoroughly So, this is where I see the benefit of mailing lists
Often Ubuntu audio mixer just stops working and generates fuzz and have to restart
Firefox facebook video chat: for some reason it just mysteriously stops working. says to restart; did that work? nope! both revving every fucking second is probably why.
Unfortunately you just resign yourself to choosing the software with the fewest or most manageable 'quirks' ('bugs' too offensive…)
How on earth can something so frequently used like TeX have such poor/cryptic parsing error information. Furthermore, pandoc doesn't treat text as text: given a different file extension, it will perform different parsing, e.g. inserts a 'hidden' YAML header if a markdown file
FreeCAD have to click in particular order for symmetry, e.g. point-line-point
QTcreator so complicated to run cross-debugger, have to write down steps
Tim Berners-Lee envisioned a universal place for information exchange. Unfortunately the freedom of creativity has resulted in much of the Web being intrusive and hindering
Order of arguments matters in docker; for some reason --rm must be before -it
default firefox credential store overwrote bitwarden password. ugh
Have so many Ubuntu variants when in reality they're just different packages installed (possibly different kernel parameters), e.g. mate, xubuntu, lubuntu, studio
Ad division of companies seem to have greatest control over design
Go on a news website and try to watch a video. after watching an obtrusive ad, video starts, skip to 30% of way, the same ad repeats…
240V/50Hz mains.
oscilloscope default noise is mains (a scope is 200MHz, 1Gsample/sec as opposed to a multimeter which is maybe 10 samples/sec, so really only applicable for perhaps a logic gate or a 0.1Hz square wave)
Ensure the BNC connector is plugged in correctly (affects probe compensation; similar to banana plugs in a multimeter not being plugged in correctly)
IMPORTANT: ensure offset dials are correct first, i.e. at 0 so that 0 is centred. with a menu open, the end-arrows can still be pressed even if not visible. change to 24 mega points of memory when just wanting wave length (not zooming in?)
Continuous triggering enables us to view from a start point, i.e. a static image not free flowing. Will show before and after the trigger point, i.e. starts in the centre of the screen
single-shot triggering: contact bounce first, then ringing as it stabilises (makes it a balancing act between selecting the triggering level for button press and release). verify signal ringing (e.g. clock signal), i.e. inspect ramp-up/down (measure time to completely bottom out)
Normal-mode triggering the best of both worlds (will show black screen unless triggered)
No real issue with the oscilloscope blowing up if testing a battery-powered/isolated DC power supply. With USB powered, ground must be on ground! Otherwise it will short USB and that port will probably break
Using RS232 (recommended standard) decoder functionality (there is also SPI/I2C decoding). https://www.youtube.com/watch?v=SarsWOCMvjg&t=76s Also investigate PWM
math -> decoder on; event table on (make sure zoomed out enough to view multiple packets (this will increase memory automatically?); increase baud also)
The Bus Pirate has a PIC in a SOIC (small outline integrated circuit) package
Multimeter measure power consumption of MCU? (stm32 nucleo boards have convenient IDD jumper)
programming in AVR assembly actually made me think fondly of modern technology (a rare feeling indeed)
TODO: MAKES THESE SUBSECTIONS OF PROJECTS
TACKLING DIFFICULT PROBLEM – HOTLOADING FILE MODIFICATION TIME BEING READ TOO EARLY. MISINTERPRETING TIMESPEC NANOSEC
SREG NOT BEING SAVED IN AVR INTERRUPT
linux raw ALSA can fall into the trap of being so niche and difficult it is neck beard inducing
use of set -e in bash scripts will exit if a subprocess returns an error, even if 2>/dev/null. Cryptic!
xlib scaled, vsync, refresh rate
ell library read test files for documentation
linux input analysing SDL2 source
polling multiple keyboards lags due to a bug in gnome (inspect the gnome git). switched desktop environment to xfce4; also makes creating shortcuts a breeze
Starting from scratch, try to avoid analysis paralysis
Pragmas and binary searching codebase to check where compiler optimises routine wrong
signed comparison/branching/add/sub specifics and code flow jumps in assembly
copy-pasta from a UDP socket attempting to access it like TCP: doesn't fail, just hangs on read()
a break inside a for loop thinking it was inside a switch; an assignment instead of a comparison inside an if; unsigned minus value causing overflow thereby giving larger than actual; mul instruction copying its result to r0, r1
compiler giving the wrong storage class for a function, even though the issue was an unmatched closing parenthesis; an example of an earlier syntax bug giving other false errors
ALWAYS ENABLE ADDRESS SANITISER! (look at niagara user github page) u8_cursor += byte_counter SHOULD-BE u8_cursor = file_mem + byte_counter;
Having -O2 made program crash as was using memory of stack that in debug mode was never modified
https://c3.handmade.network/blog/p/8486-the_case_against_a_c_alternative
instead of 'feature', we introduce patented terminology like 'user stories' so someone has to teach you it; slows down development. 'business logic' …. just means high-order operations in say main()
Incessant unit-testing: why not test the startup assembly then? Falls apart… What not to test, e.g. assume that hex_to_bin() is simple enough to work? Introducing formulas to determine whether or not to automate something….
certain ‘design pattern’ enforce really long names to conform to pattern. any competent programmer can read use-case specific functions with clearer names “test_CommandHardware_CheckForMsg_Should_GetCharAndAddToPacker_When_BytesAvailable”
file for every test, file for every class. leads to awful build times
How can software better serve humanity? e.g. bloatware causes slowness for aus post workers
With UML, if done to a sufficiently complex level to serve as a blueprint, you may as well have just written the program. The idea of iterative design would be followed by literal architects, however for them it is too costly and time consuming; with software, we have the ability to do this. 1. design (urban planner): separation of code (mental clarity and division of labour); design metrics are temporal coupling (physics outputs data to renderer, so format is important), layout coupling (renderer inherits from opengl), ideological coupling (threading, width, memory), fluidity (one change to the system causes a major crash) 2. programming (architect) 3. compiler (builder)
Most design patterns are just utility classes rather than a way to architect a program
I don’t want to fight the language (Java). Higher level languages should allow you to easily express cpu instructions
reject the idea of TDD driving good design. accept that tests validate design. tdd and bdd have good elements in them, but the dogma is not effective.
code shouldn’t take longer than 10 seconds to compile.
in general we don't add security threats that weren't already present, e.g. loading from a shared object could just as easily override the binary if we have write privileges to both
even though a ‘new’ language won’t crash, it can still have buffer overruns. NullPtrExceptions
c++ struct functions implicitly have a this pointer. virtual functions result in a vtable (array of function pointers) being generated for that struct. therefore, a virtual function call will first go to the vtable, then look up the function, so double indirection (so not a zero-cost abstraction). a normal function call is just call 0x1234, however with a vtable it's mov rax, qword ptr [rsp + 20] etc. (dereferencing pointers)
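A rough C sketch of what this effectively amounts to (names illustrative): each object carries a pointer to a table of function pointers, so a 'virtual' call loads the table pointer, then the entry, then calls it.
#include <stdio.h>
typedef struct Shape Shape;
typedef struct { void (*draw)(Shape *); } ShapeVTable; /* one table per 'class' */
struct Shape { ShapeVTable *vtable; int id; };         /* every object carries a vtable pointer */
void circle_draw(Shape *s) { printf("circle %d\n", s->id); }
ShapeVTable circle_vtable = { circle_draw };
int main(void)
{
  Shape c = { &circle_vtable, 1 };
  c.vtable->draw(&c); /* load vtable pointer, load function pointer, then call: double indirection */
  return 0;
}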
an engine makes things that aren’t likely to be difficult easy (except for linux…) important to know low-level to write new tools. we don’t have a wheel yet.
for any non-trivial task, scripting languages become a hindrance: no static type checking, no real debugger, slow, not as capable. there are complications in software we have wrongly convinced ourselves are necessary, e.g. scripting for hotloading. hotloading C is far superior, as C is more powerful and we can use the same debugger (using Lua is a downgrade). build systems can be useful as they allow for incremental builds (however, this negatively reinforces people to only make small changes) (the speed increase may only be noticeable for a large, complex code base). they are also useful for managing cross compilation (libraries have to be pulled in and compiled also). the idea of incorporating a scripting language into a game was a failed experiment of the mid 2000s. things like a visual based interface are fine as they're constrained
with closed source it is often the case that a company employs someone to oversee the experience of the software and have QA; therefore, better quality with this higher layer of checking than open source.
the best bet in safeguarding security is to reduce the attack surface, which means to reduce the number of lines of code.
Don’t restrict right side of bell curve Let your aces be aces Being an ace involves having an opinion Most influential software written largely by one person, e.g Linux, Unix, git etc. Then a team is assigned to maintain it. Fallacy about solo programmer productivity requiring large teams. Design by committee pushes design to middle of bell curve as opposing views average out
templates add complexity in the debugger (no actual names). only really useful if you save a ton of code (not much code to just write each implementation) if templates are necessary then just use meta-programming as it is much more powerful.
many programs treat memory as an infinite resource. allocating memory introduces a failure case, and I'm not a fan of allocation festivals. we can create our application with minimal failure cases (cannot do this with the platform layer).
uml and diagrams in general are a waste of time (its just code you would write and often fails to capture subtleties) you should become more proficient at reading code and understanding its relationship.
oh no, we had a security bug in our development version! (printf and friends. printf %f defaults to double)
understanding history is important; the c runtime library's way of packing return failure information is the reason for inverse truthiness
downloading unverified external tool, always good to get more viruses on my machine… unless playing at EVO with some maxed out razer device, not feasible to hit that hard I guess RTFM is the answer to that…
build tools are more of a hindrance! always asking yourself what flags are being passed (linking and compiling are separate steps), what files it is picking up, what the CWD is, etc.
Go against merge requests from strangers and just automatically allow a group of trusted people. This avoids the problem where you work for hours only to have it blocked
Many online communities are anti-engineering in that they don’t embrace criticism.
Doing anything on the web takes a lot longer than it should, dealing with a myriad of software with different odd conventions. Lack of functionality/integration with hardware will lead to collapse. Many features are lacking, like a type system, which they try to emulate; many features they do have, like garbage collection, they try to avoid.
Scripting languages can cause heap fragmentation. Why not just use a real language, as we want robustness (scripting languages are dependent on interpreter speed) and type checking
Fundamental lack of awareness that there is a better way to program. We all make slow software because of lack of time, however to say it can't be done is a fallacy. The cultural differences with these people make it a fool's errand to try and get them to program correctly, e.g. the time visual studio users think is fast is less than 10 seconds?!
Const rarely finds bugs that I have, i.e. writing to a variable I shouldn't. In saying that, you should use features of the language that help you catch bugs.
Apple store is hardly a free marketplace. They can just block your app for any reason
DRM and engine usage make using games in the future difficult (e.g. museum)
Also, the excessive testing is pushed by web where the poor languages dictate heavy testing. Testing first makes no sense as the app may change
Audiophiles think they hear things that aren’t there
Much like the food industry has organic vs processed, we need a term for games that are made by people who love games and care about the experience as opposed to large companies concerned with making money (indie vs triple a)
Sometimes crashing is good because it signifies a serious problem that can be rectified immediately rather than some other insidious hidden bug. The problem is not mapping an id/pointer (or whatever std:: c++ people would use) to an entity correctly (i.e. a correspondence problem); the symptom of this problem will differ depending on the implementation, e.g. a pointer will crash the program.
Crazy nuttiness of command line parsing UNIX. No one remembers single commands except privileged ls. Plus sign turning off, minus on, etc.
The flip side of high level languages is loss of capability in controlling the cpu Run time languages are slower and more complex Most scripting languages aren’t designed for modern hardware, e.g. simd, threads
Scripting once, deploying everywhere is broken. You must test what you write on each machine. Computers aren’t wonderful abstractions we wish they were
C was created to solve the problem of a portable high-level assembly language for UNIX. C++ is a frankenstein language with bjarne just adding features in, e.g. he just wanted C with classes. Go and Rust were designed for specific purposes, although I don't care about safe memory features or want garbage collection. Complexity creates bugs; C++ is incredibly complex. People just stick to a subset of c++, which may be different to yours. Exceptions defer error handling and also bring about an ethos of more errors when they should just be states of your program.
roughly 90% of games played on PC are pirated. however intensive anti-piracy makes it harder for people to play your game. yet, in line with this, there is no code of ethics in computer science for wasting people's time. so, get more respect from the community? drm just introduces another way for the software to waste people's time (however, there may come a day when it's required…)
Successful ideas are higher-level languages over assembly and compile-time type checking. Success is solving a problem people have talked themselves out of solving. Why does Twitter need 4000 employees? SpaceX is roughly the same size and they put rockets into orbit! They make problems difficult by the engineering culture they create: 18000 classes overflowed a java function pointer buffer; they are deluded in thinking this way. Over-concerned with UML, state diagrams, acronyms, etc. Nightmarish distractions/unproductive
* High level 'don't worry about implementation details' - chrome c++: entering a 1 char created 25000 memory allocations with std::string
* Abstractions - more code makes the program as a whole more difficult to understand; abstractions just hide the low level, which is the most important part and will constrain the system; java needs a huge call stack to make an http request; make lightweight abstractions only
* Functional programming - Haskell has been around since 1990 and hasn't taken over; imperative is clearer; functional breaks down over non-trivial work
* Data hiding - poor cache performance, redundant code
* Excessive inheritance - poor cache performance
* Exceptions - constant cognitive tax on remembering what throws what, and huge verbosity doing so; also don't know what the program will throw at any given time
* Commenting - write good comments and garden them as they can rot easily
* Re-use - only effective when used to a small extent; humans are bad at understanding layers upon layers; you don't make something better by making it more complicated
All aforementioned ideas can be contextually useful
The backbone of the web, tcp/ip, has scaled tremendously well over 60 years. The web software stack is the opposite: browsers rev every 15 minutes, everything is worse than it was before, JavaScript slower than anything before it. Nodejs, php etc. were born out of a good idea to make a particular type of software development rapid. However they fail to effectively utilise system resources to solve a problem that would be easy in C. Often people are only concerned with how fast they can write this, without caring about quality. In the web sphere google, facebook etc. know their competitors aren't going to be concerned with quality, e.g. gmail is incredibly slow and littered with bugs (typing in a name gives the wrong result as it hasn't finished a round trip to the server), so they use these technologies that help get something working quickly but will be janky. I want to write software where quality matters and software is enjoyable to use. The web needs to acknowledge this. They have different thinking, e.g. there's network latency so we don't have to care about performance; no, you need more effort to hide this latency! Although different tradeoffs are made for different contexts, on the whole programming is programming. On the web it's almost impossible to write good software as you have to deal with complexities that don't have to be there. The best minds are being put to making people click ads
Why is it big news regarding the command prompt? Battlefield is orders of magnitude more complex and runs faster. MSDOS could see all physical memory; modern cpus have an mmu
must be able to judge the quality of something from your own criteria, not simply on social cues/norms
memory safety not of that much concern for programming. however, for security I suppose it is? wish security wasn’t an issue https://github.com/KULeuven-COSIC/Starlink-FI/
various daughterboard accoutrement nomenclature, e.g. arduino shield, beaglebone cape, raspberry pi hat, ST Zio/Morpho connectors etc.
discordant packaging naming in ubuntu ‘dummy packages’, e.g. qemu is package name, however binaries are named differently.
still, installing an OS is a cross-your-fingers exercise. A single USB presents multiple boot options; choosing between two seemingly identical uefi entries, only one will give you WIFI during install. ubuntu forums have contemporary posts mentioning install issues. this finicky nature means just have a barebones install and run separate OSs in a virtual machine. chose xfce4 for the gnome repeat-key bug and easy shortcuts
the lack of type systems in Javascript and Python leads to extensions that retrofit them (e.g. TypeScript, type hints)
Vulkan like filling in a centrelink form, so much setup
Any overarching ideology like 'everything is objects' or 'everything is data' is pretty arbitrary. Certain approaches are useful for particular problems. I'm sensitive to input lag in a text editor, so use vim. Intellisense pop-ups are slow
Although large groups can help maintain, the introduction of many can create imperfections. This extends to all software companies: they do a lot of good things, but will always have things they are dysfunctional at, e.g. documentation, disparate build systems, non-uniform design practices
Loss of generational knowledge, i.e. people forgetting the genesis of influential software written by individuals
Growth of machine learning is due to the quantity of computation available. Localised improvements, overall degradation (rendering, update processing etc. far more involved). Wrong perception that performance takes too much time, so we won't do it, but 'we could'. However, if they have never done it, how could they say they could? Growth of startups is just finding a niche, rather than writing revolutionary software
Testing nomenclature so diverse, yet could be boiled down to a handful of terms, e.g. Regression testing just re-running tests on new changes, which is something you do implicitly
JSON is not even robust enough to allow trailing commas that even C89 arrays support …
Now have ‘memory leak’ finder tools for interpreted languages like Javascript…
IPv6 classic example of second-system effect
Windows: have to enable showing of file extensions so you don't unknowingly end up with a double .txt extension
Although I’m in favour of an OS wrapper, these don’t give you all the control you need and you end up having to write your own OS specific code
Unfortunately can’t even use modern GNU AVR assembly with numbered labels, advanced macros and location counter
C stdlib BUFSIZ macro vastly different across OSs, e.g. 1K, 8K, 512bytes
Concept of non-daemonic and daemonic ‘threads’ in python
w3m-img installs to non default path /usr/lib. Have to resort to X11 python ueberzug
difference in size between C and CPP header files for vulkan SDK is huge, leads to much slower compile times. contains some useful things, but not that useful
Unfortunately software like npm is poorly designed, so it will inevitably get 'fast/lightweight' variants
Many tutorials on low-level like Vulkan, X11 can be wrong so need to understand spec
Fibres are green threads, i.e. not OS or hardware threads so not actually faster. They just allow interleaved execution which is used in web to not freeze UI on large linear execution
Unfortunate trend on the Internet for people to ask why aren't you using what I'm using, e.g. VSCode, as opposed to asking why you are doing things your way. Tech news propagates shallow information fast; in aggregate, most of it is not important, e.g. a new bright-blueberry distro release
The distinction between new as in fashion and new as in forward progress can be overlooked in tech. Unfortunately, most is the former
Despite Wayland (and Mir) being far more sane (merging display server and compositor into one), it's still not fully supported
Being able to sift through low-quality tech news essential, e.g. ‘new’ database library, css library, VR headset, distro, container library etc. Amazon, Tesla, Twitter, Netflix endeavours Cancer, alzheimer trial drugs Fusion energy
Have to update Ubuntu version if want to easily get compiled version of modern gcc
Unfortunate mnemonic repetition of nonsensical statements in programming, e.g. manual memory management is hard (overly simplistic and misleading)
Auto, i.e. type inference, is good when the type is not actually known. However, it's often used in place of a name that is too long to type, leading to unclear types and over-complication
Virtually every week a new ‘container-specifier’ project appears on Github
C grew out of existing code. All new languages trying to be a top-down design on the opinions of a single programmer (BDFL) Just have a simple language with ideal metaprogramming/code generation facilities so the programmer can decide, not the language designer
Whilst github codespaces seems cool, the reason for having so many containers is that the build systems for each project are incredibly complex and require many dependencies
Sometimes have to rephrase something for the compiler to generate more optimal code, e.g. div /= 2; div /= 2; may generate more efficient instructions than div >>= 2 for avr-gcc
The advent of AI assistants just gives you more settings to disable at factory-reset time on a phone. In 2022, a modern Samsung Galaxy S22 Ultra won't even be recognised over USB to transfer files on Ubuntu and Windows …
With security-conscious individuals, surely more dependencies is bad from this viewpoint. No matter…
With the proliferation of web technologies, now say AOT (ahead-of-time) compilation for what was normally just compilation
Reason for GUI proliferation is that installation procedures are so complex, e.g. install and use esp-idf requires python virtual environment
vinegar and bicarb for walls and drain cleaning
cool idea of SDR (software defined radio) and RF analyser https://greatscottgadgets.com/sdr/1/ https://github.com/ainfosec/FISSURE
how does quantum computing work?
what frequencies/technologies are regulated like in ISM; by extension what other regulations are there?
why floating point inaccuracies
examples of hardware virtualisation instructions/codecs? (e.g. multiway MMU?)
vulkan renderer: https://www.youtube.com/watch?v=BR2my8OE1Sc&list=PL0JVLUVCkk-l7CWCn3-cdftR0oajugYvd&index=1
how does something like red hat and android get around gplv2 of linux kernel
the SIG introduces bluetooth 5, which states higher data rate, long-range BLE etc. are their decisions informed by technology advances? (seems not to be the case with 4G LTE…)
Understand memory alignment and the cost of unaligned accesses? (the ABI defines alignment of C types?) (is it due to common programmer workflows and fewer transistors required?) for modern hardware, trying to read data from an unaligned address can result in 2 reads and a combine. so, there are compiler extensions to specify alignment (__declspec(align) for the stack and _aligned_malloc for the heap on MSVC; alignas/aligned_alloc in C11). why not just always automatically align things? to conserve space? when to use these alignment extensions? when doing performance timing/tuning?
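As a sketch of the alignment extensions (C11 spellings; MSVC uses __declspec(align(N)) and _aligned_malloc instead; 64-byte cache line assumed):
#include <stdalign.h>
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
  alignas(64) char stack_buffer[256];           /* cache-line align a stack buffer */
  float *heap_buffer = aligned_alloc(64, 1024); /* size must be a multiple of the alignment */
  printf("%p %p\n", (void *)stack_buffer, (void *)heap_buffer);
  free(heap_buffer);
  return 0;
}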
is the distinction between unicast (device to device), multicast (device to some devices) and broadcast (device to all devices) only discernible in the packet format, i.e. must it be parsed by the network card first? is the signal strength changed?
Software laws, e.g. products have 4k in the name but image quality is 1080p; Elon Musk making false claims; most dash cams use the same SoC and image sensor, just different housing; Garmin products require signing up for a Garmin account and agreeing to collection of data. what are the laws for misinformation, e.g. tesla, '4k' dash cams, misleading stats: M1 chip faster (only faster than Intel-based macs), 'we don't track in traditional ways', 'tiktok keylogger technology exists but not using'
TODO: favourites tab for morning viewing videos: common sense skeptic; techquickie; handmade podcast; network next
Resident Name Unit Number / 17 High Street University Terraces Kensington NSW 2033 Australia
chefgood: 20 x $209 mymusclechef: 20 x $219 WELCOME20
dinner + lunch: $220 breakfast: $100 rent: $345 phone: $70
leftover: $265
salary: ≈$52,000
fortnightly: $1400 + quarterly: $4000 = weekly: $1000
communicating with busy people: https://threadreaderapp.com/thread/1562510420644343810.html
how could thread local storage be implemented on a hyper-threaded cpu?
computer science papers https://blog.acolyer.org/
QUESTIONS:
security so vast and not something I want to devote time to: https://leveleffect.referralrock.com/l/JOHNHAMMON07/
the constellation we can't see, as it's blocked by the sun, is the zodiac sign for that month. not new territory to regain interest in the past, e.g. the renaissance grew interest in ancient greek astrology. zodiac is latinized from zodiakos, meaning circle of animals (all zodiac names are latinized greek). vernal equinox is the beginning of the astronomical year; daylight starts getting longer until the summer solstice, hence spring is when the zodiac starts. aries (march 20th) -> ram; taurus (april) -> bull; gemini (may) -> twin brothers; cancer (june) -> crab; leo (july) -> lion; virgo (august) -> woman; libra (september) -> scales of justice; scorpio (october) -> scorpion; sagittarius (november) -> archer's arrow; capricorn (december) -> goat; aquarius (january) -> water bearer; pisces (february) -> opposing fish
IS IT POSSIBLE THAT AN ASSEMBLY INSTRUCTION LIKE RDTSCP COULD BE TRAPPED BY PROGRAM LOADER? https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
NOTE(importance of reading programming papers…. after handmade finished)
vector math routines (obtaining the cross product from column vector form):
* when drawing vectors in a physical sense, keep in mind they are rooted at the origin (even if drawings show them across space)
* whenever doing vector addition/subtraction, remember the head-to-tail rule (their direction is determined by their sign); could also think that you subtract whenever you want to 'go away' from something
* dot product transpose notation is useful for emulating matrix multiplication
* unit circle, x = cosθ
* the dot product allows us to project a vector's length onto a unit vector
* the dot product allows us to measure a vector on any axis system we want by setting up two unit vectors that are orthogonal to each other
* a simple plane equation with d=0 will pass through the origin (altering d shifts the plane up/down)
* the cross product gives a vector that is orthogonal to the plane that the two original vectors lie on (length is |a|·|b|·sinθ), so it really only works in at least 3 dimensions
* with units, e.g. for a camera, start with an arbitrary 'unit' definition; later move onto more physical things like metres
* by applying a scaling factor to a direction vector, we can move along it
* world space coordinates: the camera position is based on these; the camera will have its own axis system, which we determine, and then use the cross product based on what we want
* understand the dot product's equivalence with the circle equation
* for multiplication of vectors, be explicit with a hadamard function
* (IMPORTANT: have a reciprocal square root approximation, which is there specifically for normalisation; much faster cycle count and latency than a square root) (sketched below)
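A small sketch of the routines above in plain C (the SSE rsqrtss intrinsic supplies the reciprocal square root approximation mentioned):
#include <xmmintrin.h>
typedef struct { float x, y, z; } V3;
static float v3_dot(V3 a, V3 b)      { return a.x*b.x + a.y*b.y + a.z*b.z; }
static V3    v3_hadamard(V3 a, V3 b) { return (V3){ a.x*b.x, a.y*b.y, a.z*b.z }; }
static V3    v3_cross(V3 a, V3 b)    { return (V3){ a.y*b.z - a.z*b.y,
                                                    a.z*b.x - a.x*b.z,
                                                    a.x*b.y - a.y*b.x }; }
static V3 v3_normalise(V3 a)
{
  /* approximate 1/sqrt via rsqrtss: far cheaper than sqrt followed by divide */
  float inv_len = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(v3_dot(a, a))));
  return (V3){ a.x*inv_len, a.y*inv_len, a.z*inv_len };
}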
noise is randomness. white noise is complete randomness. blue noise (harder to generate) is randomness with limitations on how close together points can be (more uniform)
pride cometh before the fall! six of one, half a dozen of the other. have our cake and eat it too. ad infinitum
Borrowing money: make clear you're the right person to give the money to, that you understand what you're doing and the game's process is figured out, and give proof you will complete it. Here is the game, why I'm good at it, why people will like it and why I'm going to succeed at developing it. Interviews: the ability to explain problems you have encountered on projects
time -p; getrusage();
callbacks less CPU intensive than polling
Saying one instruction is faster than another ignores the context of execution, e.g. mul and add have the same latency, however due to pipelining the mul execution unit might be full. TLS vs atomics, e.g. TLS is a series of instructions determined by the OS and compiler; atomics depend on how other cores are run and the synchronising necessary with them. So, must measure which is faster for the particular situation
use $(time) for single line, use $(ctime)
Can pin a thread to a core
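On Linux this might look like the following sketch (pthread_setaffinity_np is a GNU extension, hence _GNU_SOURCE; the core index is illustrative):
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
/* pin the calling thread to the given logical core; returns 0 on success */
int pin_to_core(int core)
{
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core, &set);
  return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}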
Can run programs in kernel space with eBPF
Frequent context-switching will give terrible cache coherency
adding restrict is also useful to prevent aliasing and thereby might allow the compiler to vectorise say array loops
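For example (a sketch), promising the compiler the two arrays cannot overlap may let it emit SIMD for the loop:
/* without restrict the compiler must assume dst and src may alias,
   forcing conservative element-by-element code instead of vectorised loads/stores */
void scale_add(float *restrict dst, const float *restrict src, float k, int n)
{
  for (int i = 0; i < n; ++i)
  {
    dst[i] += k * src[i];
  }
}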
Before cpus increased in single thread execution speed. Now more cores. It’s a topic of research to convert single threaded into multithreaded for emulation. This is why emulation of something like the GameCube (powerpc) is slow. Furthermore, due to hardware irregularities that programs relied on may take hundreds of instructions to emulate If actually a simple translation, then should run close to native speed. This is reality of emulating hardware with hardware
CISC gives reduced cache pressure for high-intensive, sustained loops
log2(n) number of bits for decimal https://en.algorithmica.org/hpc/cpu-cache/associativity/
using genetic algorithm/machine learning to optimise for us https://zeux.io/2020/01/22/learning-from-data/
CPUs try to guess what instructions lie ahead (speculative execution). The cost of an incorrect guess (pipeline flush) is expensive. So we want to get rid of unpredictable conditional jumps, ideally replacing them with conditional movs or arithmetic branchless techniques. Endianness (register view), twos complement (-1 is all 1s). Branchless programming is essentially SIMD
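A sketch of an arithmetic select replacing an unpredictable branch (compilers may emit a conditional move for either form, but the branchless version makes the intent explicit):
#include <stdint.h>
/* branchy: a mispredicted jump flushes the pipeline */
int32_t select_branchy(int32_t cond, int32_t a, int32_t b)
{
  return cond ? a : b;
}
/* branchless: mask is all 1s when cond is non-zero, else all 0s */
int32_t select_branchless(int32_t cond, int32_t a, int32_t b)
{
  int32_t mask = -(int32_t)(cond != 0);
  return (a & mask) | (b & ~mask);
}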
If variable clock speed, cpu could detect not using all cores and increase single core clock
Not memory bound is best case for hyper threading Intel speeds optimised for GPR arithmetic, boolean and flops Intel deliberately makes mmx slow
Low cache associativity means fast lookup but a lot of misses and thus eviction policy like LRU
Count cycles to counteract possible thermal throttling Hyperthreads useful only if different execution unit Cpu reads memory from cache and ram in cache lines (due to programmer access patterns). Each item in cache set is cache line size
if apple computers use RISC ARM in the M1, why is CISC necessary? (only because of Intel legacy?) the emphasis of CISC is to simplify assembly (e.g. more addressing modes), thereby reducing the size of the binary (fewer instructions) and increasing cache coherency. RISC requires fewer transistors to implement complex hardware, but will it make optimising harder for the compiler? single cycle instructions (reduce cycles per instruction)
when looking at a pointer, to optimise compiler must know whether it can assume it points to a local var or not. so, easier to eliminate aliasing with non-pointers
when viewing from application in a sandboxed environment like a phone, total RAM less than installed as portion reserved for kernel
simplistically RAM 50ns and HDD 10µs? faster to read than write
2.5bln (cycles/sec) * 8 (simd width) * n (execution units) * 2 (cores), assuming instructions have a throughput of 1. so, given say 64 bytes/cycle of L1 bandwidth, 64 / 4 gives how many floats per cycle from L1 cache. in general we are not streaming from memory the entire time (would probably hit a cache bandwidth limit)
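A purely illustrative back-of-envelope in code (made-up machine numbers, not a specific chip):
static double peak_flops_estimate(void)
{
  double clock_hz   = 2.5e9; /* cycles per second           */
  double simd_width = 8.0;   /* floats per SIMD instruction */
  double exec_units = 2.0;   /* independent mul/FMA ports   */
  double cores      = 2.0;
  return clock_hz * simd_width * exec_units * cores; /* = 8e10, i.e. ~80 GFLOP/s */
}
/* similarly, with 64 bytes/cycle of L1 bandwidth: 64 / 4 = 16 floats delivered per cycle */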
// undefined behaviour if not true ALWAYS_INLINE void assume(bool cond) { if (!cond) { __builtin_unreachable(); } } times when manual inlining is required: https://www.youtube.com/watch?v=B2BFbs0DJzw
Unless you're in web where everything takes years to load, as single-threaded performance is largely stagnant you will have to utilise parallelism if you want performance. This is very difficult, and as with single-threaded code, generic libraries (which could actually be very specific) will create bloated, poorly performing code bases. Multithreading is a whole new discipline built on top of single-threaded programming. There are a lot of pitfalls for performance (balancing wanting things local, however must share to utilise cores). If you care about performance for anything, you should care about cache misses. Memory bandwidth and caches are major reasons for a CPU attaining performance. You have to think about where memory actually lives and how it is transferred around
Optimiser allows for lexical scoping of stack variables. However, for optimiser to inline will have to get rid of pointers to prevent aliasing
stack and heap memory are the same physical thing, so only more efficient if memory was hot, i.e. recently touched
Computers are faster than you think, e.g. instructions, clock cycles, cores; very large numbers. With an M.2 drive it should be quick
it may be easier for an optimising compiler to work with things passed by value as opposed to by pointer. so, if we need to modify something we may have to return the value in a functional programming style; however, this is error prone. the compiler cannot optimise pointers, e.g. if setting two values through the same pointer, the compiler cannot assume that both values are the same, as another pointer may point to the same address. this is aliasing.
we see contiguous memory in virtual memory, however in physical memory it is almost certainly going to be fragmented
compilers can auto-vectorise loops for us (and other operations if we perform say 20 of them). so, floats will be twice as fast as doubles (twice as many fit per SIMD register, even though same latency and throughput)
optimising is a very precise process. only do when you have code that is working and you know will keep. games by their very nature are about responsiveness, so optimisation and low-latency is important. I like programming with this as a mentality.
with pure compiler optimisations, i.e. code we have not optimised ourselves, a 2x increase is not unexpected. (code we optimised not as much)
optimise for the worst case (looking out on the whole world) not the best case (in front of a wall, don't render what is behind). we care about the lowest framerate, not the highest.
virtually never use lookup tables as ram memory is often 100x slower (so unless you can’t compute in 100 of instructions)
When many people say too much effort involved in optimisation, they are generally thinking of point 1
sometimes things run slower than they should, however there's only so much we can replace, e.g. the structure of many OSs is based on legacy code, so simply outputting to stdout may go through the shell then the kernel etc. also, we may have to deal with pessimised libraries. in these cases: * isolate the bad code, i.e. draw a hard boundary between your code and theirs by caching calls to them * do as little modification to the data coming in from them as possible (no need to say put it in a string class etc.)
People think that it's slow, but 'it won't crash' (because of the interpreter). Performance is critical in getting people excited about what you do, e.g. windows CE was laggy; the iphone was performant but had fewer features and changed the market. low latency is more desirable, even if fewer features
count number of math ops in function and control flow logic that is evaluated branches can get problematic if they’re unpredictable
when inspecting measurements of a microarchitecture, consider latency and throughput (how soon it can be issued again). so, FMUL may have a latency of 11, however a throughput of 0.5, so can start 2 every cycle, i.e. issue again (cpus these days are incredibly overlapped even if single-threaded). so throughput is more of what we care about for sustained execution, e.g. in a loop
however, these numbers are all assuming the data is in the chip. it’s just as important to see how long it takes to get data to the chip. look at cache parameters for microarchitecture; how many cycles to get from l1 cache, l2 cache, etc. when get to main memory potentially hundreds of cycles
bandwidth of L1 cache say is 80 bytes/cycle, so can get 20 floats per cycle (however, based on size of L1 cache, not really sustainable for large data)
http://igoro.com/archive/gallery-of-processor-cache-effects/
(could just shortcut this and just see if flops is recorded for our chip)
so, using these rough numbers we should be able to look at an algorithm (and dissect what operations it’s performing, like FMULs), know how much data it’s taking, and give a rough estimate as to how long it could optimally take (will never hit optimal however) IT’S CRITICAL TO KNOW HOW FAST SOMETHING SHOULD RUN
REASONS SOFTWARE IS SLOW: 1. No back-of-envelope calculations (people aren't concerned that they are running up to 1000 times slower than what the hardware is capable of). These calculations involve say looking at the number of math ops to be performed in the algorithm and comparing that to the perfect hardware limit 2. Reusing code (20LOC is a lot; using things that do what you want to do, but not in the way you want them done; often piling up code that is ill-fitting to the task, e.g. we know this isn't a regular ray cast, it's a ray cast that is always looking down) 3. When writing, thinking of goals ancillary to the task (not many places teach how to actually write code; all the high-level abstractions about clean code, templates/classes etc. are not about what the computer actually has to do to perform the task) (there is no metric for clean code; it's just some fictional thing people made up)
WHENEVER UNDERSTANDING CODE EXAMPLES: 1. COMPILE AND STEP INTO (NOT OVER) IN THE DEBUGGER AND NOTE THE HIGH-LEVEL STEPS PERFORMED. look at these steps for duplicated/unnecessary work (may pollute the cache) (perhaps even asking why the code was written the way it was). could we gather things up in a prepass, i.e. outside the loop? if allocating memory each cycle, that's game over for performance. do we actually have to perform the same action to get the same result, e.g. a full raycast is not necessary, just a segment on the grid. O(n·m) is multiplicative, not linear O(n); big oh is just an indication of how it scales, and it could be less given some input threshold (big oh ignores constants, hence looking at asymptotic, i.e. limiting, behaviour). now, once the code is reduced, look at minimising the number of ops
cpu front end is figuring out what work it has to do, i.e. instruction decoding
e.g simd struct is 288 bytes, 4.5 cache lines, able to store 8 triangles
understanding assembly language is essential in understanding why the code might not be performing well
branch prediction necessary to ensure that the front-end can keep going and not have to wait on the back-end
execution ports execute uops. however, the days of assembly language registers actually mapping to real registers is gone. instead, the registers from the uops are passed through a register allocation table (if we have say 16 general purpose registers, table has about 192 entries; so a lot more) in the back-end this is because in many programs, things can happen in any order. so to take advantage of this, the register allocation table stores dependency chains of operations (wikichips.org for diagram) from execution port, could be fed back into scheduler or to load/store in actual memory
when looking at assembly, when we say from memory, we actually mean from the L1 cache
xmm is an sse register (4 wide, 16 bytes); m128 is a memory operand of 128 bits; ymm is 8 wide. 1p01 + 1p23 means issue 1 micro-op on either port 0 or port 1 and one micro-op on either port 2 or port 3. so, we could issue the same instruction multiple times per cycle, i.e. throughput of 0.5
micro-op fusion is where a micro-op doesn't count towards your penalty as it's fused with another; with combined memory ops, e.g. vsubps ymm8, ymm3, ymmword ptr [rdx], this is the case. so, if a compiler were to separate this out into a mov and then a sub, not only does this put unnecessary strain on the front-end decoder, it also removes micro-op fusion as they are now separate micro-ops (important to point out that I'm not the world's best optimiser, or the world's best optimiser's assistant, so perhaps best not to outrightly say bad codegen, just say it makes me nervous)
godbolt.org is good for comparing compiler outputs and possibly detecting a spurious load etc.
macro-op fusion is where you have an instruction pair that the front-end will handle for you, e.g. an add and a jne will merge to add+jne, which sends just the 1 micro-op through
uica.uops.info gives the percentage of time an instruction was on a port (useful for determining bottlenecks, e.g. a series of instructions all require ports 1 and 2, so they cannot be parallelised easily). so, although the best case might be, say, finishing the loop body every 4 cycles, this port bottleneck will push the real cycles-per-iteration higher, i.e. lower throughput
some levels of abstraction are necessary and good, e.g. higher level languages to assembly
Optimise: gather stats -> make estimate -> analyse efficiency and performance
file size https://justine.lol/sizetricks https://codegolf.stackexchange.com/questions/215216/high-throughput-fizz-buzz/236630#236630
more important to understand how the CPU and memory work than the language involved. in an OS, you will get given a zeroed page due to security concerns
likely() macros for branch prediction compiler optimisations (https://akkadia.org/drepper/cpumemory.pdf, pg 56)
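A minimal sketch of such macros, assuming the GCC/Clang __builtin_expect extension (LIKELY/UNLIKELY are hypothetical names; on other compilers they compile away to nothing):

    // Hint the compiler which branch is the common path so it can lay out code accordingly.
    #if defined(__GNUC__) || defined(__clang__)
        #define LIKELY(x)   __builtin_expect(!!(x), 1)
        #define UNLIKELY(x) __builtin_expect(!!(x), 0)
    #else
        #define LIKELY(x)   (x)
        #define UNLIKELY(x) (x)
    #endif

    // usage: keep the error path out of the hot fall-through path
    // if (UNLIKELY(bytes_read < 0)) { /* handle error */ }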
recording information: we want to understand where we are slow with vtune, amd uprof, arm performance reports. Next, determine if IO bound, memory bound etc.
To determine performance, must have some stable metric, e.g. ops/sec, to compare against, e.g. measure total time and number of operations. Hyper-threading is useful in alleviating memory latency, e.g. while one thread is waiting to get content from RAM, the other hyper-thread can execute. However, as we are not memory bound (just going through pixel by pixel and not generating anything intermediate; it will all probably stay in L1 cache), we are probably saturating the core’s ALUs, so hyper-threading is not as useful
Inspecting the assembly of our most expensive loop, we see that rand() is not inlined and is a call festival. This must be replaced. Essentially we are looking for mathematical functions in our hot path that could be inlined and aren’t. When you want something to be fast, it should not be calling anything; if it does, you have probably made a mistake. Also note that it is using SIMD instructions, however not to their widest extent, i.e. scalar single ‘ss’ ops; want to replace these with packed single ‘ps’ ops
we have the option of constructor/destructor pairs if we want to determine the best possible time if all caches align etc. ‘hunt for minimum’, e.g. record the minimum execution time over loop iterations, or re-run if a smaller time is yielded. alternatively, we could develop a statistical breakdown of values (could see moments when the kernel switches us out etc.)
(IMPORTANT save out configuration and timing information for various optimisation stages, e.g. ./app > 17-04-2022-image.txt)
Agner Fog’s optimisation website; ‘What’s a Creel?’
threading: Observe that CPU percentage use is not close to 100%. For multithreading, often have to pack into a 64 bit value to perform a single operation on it, e.g. delta = (val1 << 32 | val2); interlocked_add(&val, delta)
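A minimal sketch of that packing, assuming the GCC/Clang __sync_fetch_and_add builtin for the atomic add (bundle_add and the hi/lo counter names are hypothetical; interlocked_add above is the notes’ own wrapper):

    #include <stdint.h>

    // Pack two 32-bit counters into one 64-bit delta so a single atomic add
    // updates both fields at once.
    static inline void interlocked_add_u64(volatile uint64_t *value, uint64_t delta)
    {
        __sync_fetch_and_add(value, delta);   // builtin atomic add
    }

    static inline void bundle_add(volatile uint64_t *value, uint32_t hi, uint32_t lo)
    {
        uint64_t delta = ((uint64_t)hi << 32) | (uint64_t)lo;
        interlocked_add_u64(value, delta);
    }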
When making code multi-threaded, segregate the task by writing a prototype ‘chunk’ function, e.g. render_tile. Then write a for loop combining these chunk functions. Before entering the chunk function, good to have a configuration printout, e.g. num chunks, num cores, chunk dim, chunk size, etc.
When dividing a whole into pieces, an uneven divisor will give less than what’s needed, so use (total + divisor - 1) / divisor to ensure always enough (see the sketch below). We will want this calculation to be in the last dividing operation, e.g. tile_width then tile_count calculated, so use it on tile_count. Associated with this calculation is clamping, to handle the added extra exceeding the original dimensions. For getting the proper place in a chunk, call a function wrapper for the pointer location per row
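A minimal sketch of the rounding-up division plus the clamp, with hypothetical names (image_width, tile_width):

    #include <stdint.h>

    uint32_t image_width  = 1280;
    uint32_t tile_width   = 64;
    // Round the tile count up so an uneven divisor still covers the whole image.
    uint32_t tile_count_x = (image_width + tile_width - 1) / tile_width;
    for (uint32_t tile_x = 0; tile_x < tile_count_x; ++tile_x)
    {
        uint32_t x_min = tile_x * tile_width;
        uint32_t x_max = x_min + tile_width;
        if (x_max > image_width) x_max = image_width;   // clamp the last tile's overshoot
        // ... operate on columns [x_min, x_max)
    }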
(May have to inline functions?) Next we want to pass each chunk onto a queue and then dequeue them from each logical core? So, have a WorkOrder that will store all information required to perform the operation on a chunk, i.e. all parameters of the render_chunk function (may also store entropy for each chunk, i.e. a random number series). Then a WorkQueue that contains an array of WorkOrders, with the total number equalling the number of chunks. So, the original loop iterating over chunks now just populates the WorkOrders. Now, in a while loop that runs while there are still chunks to execute, we call the render_chunk function and pass in the WorkQueue. The render_chunk function will increment next_work_order_index and return true if more is to be done (a sketch follows below)
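A minimal sketch of that shape; the notes only fix the names WorkOrder, WorkQueue, render_chunk and next_work_order_index, so the fields and the __sync_fetch_and_add builtin here are assumptions:

    #include <stdint.h>

    struct WorkOrder            // everything render_chunk needs for one chunk
    {
        uint32_t x_min, x_max, y_min, y_max;
        uint32_t entropy;       // per-chunk random number series seed
    };

    struct WorkQueue
    {
        WorkOrder *work_orders;
        uint32_t work_order_count;
        volatile uint64_t next_work_order_index;   // bumped atomically by workers
    };

    // Returns true while there was still a chunk left to render.
    static bool render_chunk(WorkQueue *queue)
    {
        uint64_t index = __sync_fetch_and_add(&queue->next_work_order_index, 1);
        if (index >= queue->work_order_count) return false;

        WorkOrder *order = queue->work_orders + index;
        // ... render the region [x_min, x_max) x [y_min, y_max) using order->entropy
        return true;
    }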
When spawning the actual worker thread functions, have the same while loop calling the render_chunk function as for core 0 (the number of threads to spawn you would think should equal the number of logical cores? however exceeding them may increase performance?) (this debate of manually prescribing the core count applies to the chunk size as well; perhaps the sweet-spot for my machine in balancing context switching and drain-out is to manually prescribe their size as opposed to computing them off the core count) (collating information into the WorkQueue struct helps for printing out the configuration) (setting up this way, we can easily turn multithreading off)
As creating threads requires platform-specific code, put prototypes in main.h and the implementations in linux_main.cpp. Then include linux_main.cpp based on a macro definition of the platform in the build script at the bottom of main.cpp (a pthread sketch follows below)
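A minimal pthread sketch of what the linux_main.cpp side could look like, reusing the WorkQueue/render_chunk names from the sketch above (worker_proc and spawn_workers are hypothetical names):

    #include <pthread.h>
    #include <stdint.h>

    static void *worker_proc(void *arg)
    {
        WorkQueue *queue = (WorkQueue *)arg;
        while (render_chunk(queue)) {}      // pull chunks until the queue is drained
        return 0;
    }

    static void spawn_workers(WorkQueue *queue, uint32_t thread_count)
    {
        for (uint32_t i = 0; i < thread_count; ++i)
        {
            pthread_t handle;
            pthread_create(&handle, 0, worker_proc, queue);
            pthread_detach(handle);         // no join; core 0 drains the same queue
        }
    }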
hyperthreading: architecture-specific information becomes more important when in a situation where memory is constrained in relation to the cache (hyper-threads share the same L1/L2 caches)
volatile says code other than what is generated by this compiler run could modify this value. it’s required for multithreading, as the compiler may not re-read a value it has cached in a register if it is changed elsewhere. when incrementing volatiles, must use a locked_add_and_return_previous_value (could return the new value, just be clear which)
simd clamp can be re-written as min() and max() combination, which are instructions in SSE Although looking at the system monitor shows cpus maxed out, we could be wasting cycles, e.g. not using SIMD
Define lane width, and divide by this to get the new loop count
Go through the loop and loft used values, e.g. lane_r32, lane_v3, lane_u32 (IMPORTANT at first we are only concerned with getting single values to work, later can worry about n-wide loading of values) (TODO the current code has the slots for each lane generated, rather than unpacked. look at handmade hero for this unpacking mode)
If parameters to functions, loft them also (not functions? just parameters? however we do random_bilateral_lane() so yes to functions?)
If using struct or struct member references, take out the values and loft them also, e.g. sphere.radius == lane_r32 sphere_r; (group struct remappings together)
Remap if statement conditions into a lane_u32 mask and remove the enclosing brace hierarchy (IMPORTANT you can still have if statements if they apply to lanes, e.g. if mask_is_zeroed() break;) (TODO for mask_is_zeroed() we want the masks to be either all 1’s or 0’s) (call mask_is_zeroed() on all masks to early out as often as possible to get a speed up)
Once all if statements are lofted, & all the masks into a single mask (it seems if there are large amounts of code inside the if statements, you don’t want to do it this way and rather check if needing to execute?) (IMPORTANT to only & dependent masks, e.g. if there is an intermediate if like a pick_mask or clamp, then don’t include it, but do the conditional assign directly on this mask)
Then enclose remaining assignments in a conditional assignment function using this single mask? (conditional_assign(&var, final_mask, value); this uses the positive mask to get source and negative mask to get dest?) (also discover the workaround to perform binary operations on floating point numbers)
So, by the end of this, all values operated upon should be a lane type? (can have some scalar types if appropriate)
We may have a situation where some items in a lane finish before others. So, introduce a lane_mask variable that indicates this. To indicate say a break, we can do (lane_mask = lane_mask & (hit_value == 0)); For incrementing, will have to introduce an incrementor value that will be zeroed out for the appropriate lane item that has finished. Have horizontal_add()?
Next, once everything is remapped, create a lane.h. Here, typedef the lane types to their single variants to ensure it works before adding actual SIMD instructions. Also do SIMD helper functions like horizontal_add(), mask_is_zeroed() in one dimension first. Wrap the single-lane helper functions and types in an if depending on the lane width set (IMPORTANT any functions that we are to SIMD, place here. if it comes that we want actual scalar, then rename with func_lane prefix)
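A minimal 4-wide SSE sketch of the conditional_assign idea, assuming (as required above) that masks are all 1’s or all 0’s per lane:

    #include <xmmintrin.h>

    // Keep dest where the mask is 0, take source where the mask is all 1s.
    static inline void conditional_assign(__m128 *dest, __m128 mask, __m128 source)
    {
        *dest = _mm_or_ps(_mm_andnot_ps(mask, *dest), _mm_and_ps(mask, source));
    }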
for simd typically have to organically transition from AOS to SOA
Debug in single lane, single threaded mode (easier and debugger works) However, can increase lane width as needed (threading not so much?)
For bitwise SIMD instructions, the compiler does not need to know how we are segmenting the register, e.g. 4x8, 8x8 etc., as the same result is obtained performing the operation on the entire register at once. So they only provide one version of it, i.e. no epi32, only si128. Naming convention has types __m128 (float), __m128i (integer), __m128d (double), and names in functions: epi32/si128 (integer), ps (float), pd (double)
Overload operators on actual wide lane structs (IMPORTANT remember to do both orders, e.g. (val / scalar) and (scalar / val)). Also have conversion functions. Lane-agnostic functions go at the bottom (like +=, -=, &=, most v3 functionality) (IMPORTANT it seems we can replace logical && and || with the binary versions for the same functionality in SIMD)
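A minimal sketch of the both-orders point, on a hypothetical 4-wide lane_f32 wrapper:

    #include <xmmintrin.h>

    struct lane_f32 { __m128 v; };

    // Both argument orders must be written out; only one comes for free.
    static inline lane_f32 operator/(lane_f32 a, float b)
    { lane_f32 r; r.v = _mm_div_ps(a.v, _mm_set1_ps(b)); return r; }

    static inline lane_f32 operator/(float a, lane_f32 b)
    { lane_f32 r; r.v = _mm_div_ps(_mm_set1_ps(a), b.v); return r; }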
(IMPORTANT simd does not handle unsigned conversions, may have to cut off sign bit, e.g. >> 1)
the process of casting a type to a pointer to access its individual bytes or contained elements (used in file reading too)
(IMPORTANT masks in SIMD will either be all 1’s or 0’s. perhaps have a specific name for this to distinguish?)
(IMPORTANT seems that not all operations are provided in SSE, like !=, so have to implement with some bitwise operations)
SIMD allows divides by zero by default? (because of the nature of SIMD, it has to allow divides by zero?)
To get over the fact that C doesn’t allow & on floating point, use the reinterpret-the-bits paradigm *(u32 *)&a, as opposed to a value-converting cast (IMPORTANT in SIMD, a ‘cast’ is reinterpreting bits, so the opposite of a cast in C)
caching https://akkadia.org/drepper/cpumemory.pdf 1. know cache sizes to have data fit in them 2. know cache line sizes to ensure data is close together (may have to separate components of structs to allow loops to access fewer cache lines), i.e. understand what you operate on frequently. may also have to align structs 3. simple, linear access patterns (or prefetch instructions) for things larger than the cache size
inline assembly (raw syscalls from github). HAVE TO INSPECT/VERIFY ASSEMBLY IS SANE FIRST, THEN LOOK AT TIMING INFORMATION
inspecting compiler-generated assembly loops, look for the JMP to ascertain the looping condition; macro-op fusion is relevant here (to say Skylake), e.g. cmp-jmp
non-programmable instructions could be executed by the cpu; similarly, some instructions exist programmatically but only on the front-end, e.g. an xmm-to-xmm move might just be a renaming in the register allocation table
also, due to concurrent port usage, can identify parts of code as relatively ‘free’
struct access is typically off a [base pointer] in assembly; 1.0f might appear as a large number, e.g. 1065353216; in a loop, repeated instructions may be due to loop unrolling
we might see:
* superfluous loading of values off the stack
* more instructions required, e.g. not efficiently using SIMD (often this exposes the misconception that compilers are better than programmers; so better to handwrite intrinsics)
comparing our unoptimised assembly version to ‘wc’, we see a noticeable speed increase: an example of non-pessimisation
Following the basic principles of non-pessimisation, I make a note of the huge amount of cruft in the C STL: the output buffering, hidden malloc() ‘optimisations’ (uncommitted memory, encountering expensive page faults later; prefer reliability/clarity over edge-performance benefits), OS line-ending conversions, non-obvious use of mutexes etc. Whilst these may seem like minor inconveniences, they can be insidious for performance, e.g. rand() has a huge call-stack that, if we replace it with a simple xorshift, results in a 3x speed up (a sketch follows below). Although easy to criticise, it may be the situation that the CRT had to be that way because of the C standards. To isolate use of the CRT, wrap it in functions so we can hopefully replace it with system calls, intrinsics, etc. Although generally okay to use an STL, it forces you to use its patterns (e.g. memory allocation, locks etc.). This is true for STLs in general across all languages; some may have bloat from other areas, e.g. C++ templates. To avoid the compiler having to generate a large export table of all functions, make them static. To avoid large amounts of linking and ∴ reduce compilation time, have a unity build. Furthermore, guaranteed ability to inline functions (as with multiple translation units, possibly one might only have the function declaration and not the definition) (issues may occur with slower incremental builds when including 3rd party libraries; yet, can still work around this possibly using ccache)
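A minimal sketch of the kind of xorshift that can replace rand() in a hot path (this is Marsaglia’s xorshift32; the state must be seeded non-zero):

    #include <stdint.h>

    static inline uint32_t xorshift32(uint32_t *state)
    {
        uint32_t x = *state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        *state = x;
        return x;       // no call overhead, trivially inlined
    }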
https://marketplace.fedevel.education/itemDetail.html?itemtype=course&dbid=1569757838995&instrid=us-east-2_KpwYC7yK5:45f6c01d-ccc8-43e0-8f33-c5a70caf707f
when creating lid lips, factor in printer tolerance, say 0.4mm
Spreadsheet (have dimensions to make them parametric): if want to access a parametric sketch constraint, <
1.75mm PLA (polylactic acid) into a 0.4mm nozzle
0.15mm layer height (speed/quality); 15% gyroid infill (integrity/speed/usage)
supports required for a model that does not print over itself
export g-code (like a printer ISA?) to SD card
supports necessary for complex geometries like overhangs, greater than 45°, etc. put supports everywhere by default? (perhaps add blockers to this?) maybe just use everywhere to get a feel for it, then manually add enforcers?
can also use printrun usb-b interface to print without SD card?
most solder has a flux core (typically rosin) to remove oxide films, i.e. wetting the metal (to remove dirt/grease will require a cloth or steel brush). Sn/Pb (60/40) has a lower melting point and a shinier finish (cone shaped) than non-leaded. fumes are flux, as the boiling point of lead (≈1700°C) is much higher. fume extractor at top
Don’t restrict the right side of the bell curve. Let your aces be aces. Being an ace involves having an opinion. Most influential software was written largely by one person, e.g. Linux, Unix, git etc.; then a team is assigned to maintain it. Fallacy about solo programmer productivity requiring large teams. Design by committee pushes design to the middle of the bell curve as opposing views average out
CPUs try to guess what instructions are ahead (speculation). The cost of an incorrect guess (pipeline flush) is expensive. So want to get rid of conditional jumps; ideally replace with conditional movs or arithmetic branchless techniques. Endianness (register view), two’s complement (-1 is all 1s). Branchless programming is essentially SIMD
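A minimal sketch of one branchless technique: replacing an if/else select with a mask so there is no conditional jump to mispredict (branchless_max is a hypothetical name; a cmov is what we hope the compiler emits anyway):

    #include <stdint.h>

    // All-ones mask when a > b, all-zeros otherwise, then blend the two inputs.
    static inline int32_t branchless_max(int32_t a, int32_t b)
    {
        int32_t take_a = -(int32_t)(a > b);
        return (a & take_a) | (b & ~take_a);
    }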
What SIMD instructions are available to the cpu? If variable clock speed, the cpu could detect that not all cores are in use and increase the single-core clock
Flops calculated with the best instruction set? Not being memory bound is the best case for hyper-threading. Intel speeds are optimised for GPR arithmetic, boolean ops and flops. Why is the Intel shr instruction so slow? Intel deliberately makes MMX slow
Polymorphism is a single object that can be interpreted as having various types. This can simply be a struct with unions and a type field.
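A minimal sketch of that discriminated-union shape (the entity names here are hypothetical):

    enum EntityType { ENTITY_PLAYER, ENTITY_WALL };

    struct Entity
    {
        EntityType type;          // discriminant: says which union member is live
        union
        {
            struct { float speed;  } player;
            struct { float height; } wall;
        };
    };

    // usage: always check the type field before touching the union
    static float entity_speed(const Entity *e)
    {
        return (e->type == ENTITY_PLAYER) ? e->player.speed : 0.0f;
    }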
Never use setters/getters unless actually doing something. You’re spending your entire day typing. If needing, replace variable name with name_ to see where it was used.
You need to be self critical to be a good engineer
Caches are a way of minimising the round-trip time to RAM by putting memory as close as possible to the core that the CPU thinks will need it. L1 closest, 1/2 cycles, 32k. Wikichip for more info. The CPU will go to the caches first
L1 can supply 2 cache lines per clock. Instructions per clock, number of work components (e.g. number of add units), cache lines per cycle, cache latency and AGUs (address generation units) impose restrictions. A cache miss is simply stalling for an instruction. However, this may not be an issue if we do other work, e.g. a complex algorithm takes many instructions, hiding memory access for an out-of-order cpu. If hyper-threading with two schedulers, a cache miss on one just switches execution to the other. Can really only know if a cache miss incurs a performance penalty by looking at raw numbers from vtune, etc. Because of the scheduler, it’s not as simple as just looking at memory sizes. So, due to the complex overlapped/scheduled nature of modern CPUs, can really only know if a cache miss incurs a penalty with vtune. The uops website displays tables for instructions
Currently good that most things are little endian with 64 byte cache lines; however, some hardware guy is going to come along and change it back to big endian
Should be a code of ethics in software to not create bugs/inconveniences for users that could’ve been avoided
An instruction of throughput 1 means issued every clock. As many instructions take longer than 1 cycle, each core requires a scheduler to see if it can execute something. View cpu as sections where there is some distance to communicate.
Making something good takes time. However, if you have crazy design practices it will also take longer. You have to be reality-based when programming; that is, in an engineering sense, design something that solves the problem you have. People become attached to a way of programming which doesn’t focus on solving the problem; they want to build Rube Goldberg machines. Selectively attacking problems seriously means you have a functional program quicker, whereby you can actually decide if those other problems need to be addressed. Can defer hard decisions to later, as they will be made better when you have more technical expertise and more context to work with
Testing is important. If you don’t write tests, your software doesn’t work. However, write higher level system tests, not excessive unit tests. More efficient and this is where bugs are likely to be. You often have to remove code, so having unit tests just increases the volume of code you have to write. Huge drain on productivity. Maybe for NASA. A new paradigm should weigh up cost-benefit. Almost always the cost is ignored and people gobble them up
To make computers better to use, have to simplify them on all levels
Faster cpu like Apple’s m1 are irrelevant if software bad
Floating point math faster than integers
my style of programming and the problems I enjoy solving are found in embedded, e.g. you’re constrained by the silicon, not like in web where you just build another data centre
Compiler works file by file, so knows nothing about calls across files. Therefore it generates object files, which are partially executable machine code with unresolved symbols. The linker merges these object files and also adds extra header information so that the OS (or more specifically the kernel, e.g. linux) can load our executable
Complete code coverage on the one hand is very thorough, however don’t get a lot of engineering output. Furthermore, most bugs appear in between systems not in units.
Best way to test is to release on early access. This checks hardware and software, user may be running adobe acrobat which hogs cpu so instruct them to kill it before running your game. Or maybe 20000 chrome plugins. This is something a hardware lab can’t tell you
A process is allocated a virtual memory space; the OS has a mapping table that converts these to physical addresses. Part of our process's address space is pre-populated by the kernel program loader, e.g. linux-vdso, environment variables, etc. Kernel tunables: sudo sh -c 'echo kernel.perf_event_paranoid=0 >> /etc/sysctl.d/local.conf' (sysctl). User tunables: ulimit -a. In a virtual address space, have user-space and kernel-space address ranges. A virtual address is mostly page table indexes and the last bits are a static offset (for security, addresses are randomised)
To make an installer, just fwrite your executable and then the data files, appended with a footer. Inside the exe, fread the exe and fseek based on the static offset of the appended resources. Bake resources in for reliability only, really
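A minimal sketch of the read-back side, assuming a fixed-size footer holding the payload size was appended after the resources, and using the executable’s own path (e.g. argv[0]) for simplicity; the Footer layout and function name are hypothetical:

    #include <stdio.h>
    #include <stdint.h>

    struct Footer { uint64_t payload_size; };

    // Reads the appended resource blob back out of our own executable.
    static void read_appended_payload(const char *exe_path)
    {
        FILE *exe = fopen(exe_path, "rb");
        if (!exe) return;

        Footer footer;
        fseek(exe, -(long)sizeof(footer), SEEK_END);    // footer is the last thing in the file
        fread(&footer, sizeof(footer), 1, exe);

        fseek(exe, -(long)(sizeof(footer) + footer.payload_size), SEEK_END);
        // ... fread() footer.payload_size bytes of appended resources here
        fclose(exe);
    }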
Packed files are better, as fewer OS operations on expensive file handles etc.
Programming is about solving the problem; this is overlooked by design philosophies. If you don’t have any functionality, you don’t have a structural problem
const is only useful if you find it catches bugs for you (maybe for globals instead of using #defines). however, in terms of optimisations, const is useless as you can cast const away. therefore, for me, const is mostly just a waste of typing. however, you have to use it for string literals in C++. In a similar vein, VLAs are useful here (note the difference between sizeof(array) and sizeof(pointer) when calculating a string array count)
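A small sketch of that sizeof distinction, together with the ARRAY_COUNT macro mentioned elsewhere in these notes:

    #define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

    char str[] = "hello";          // array: sizeof(str) == 6 (includes the '\0')
    const char *ptr = "hello";     // pointer: sizeof(ptr) is 8 on a 64-bit target
    // ARRAY_COUNT(str) == 6; ARRAY_COUNT(ptr) compiles but gives a wrong answer (pointer, not array)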
distinct areas of memory in assembly are stack, heap and data (globals)
direction of stack growth is often determined by the CPU; if selectable, then by the OS, e.g. x86 is downwards. ulimit -s for stack size (the main executable will have its stack size listed in its headers)
good practice to assign a variable for syntactic reasons, i.e. more readable, e.g. Controller *controller = &evdev_input_device.controller;
getconf can list POSIX system configurable variables
don’t go through a ton of unnecessary stdlib. malloc is probably more optimised for small memory size requests; relative to the overall code base, interfacing with system calls is not that much code (and can be reused)
if we try to access memory from an invalid page (not reserved or committed) we will get a segfault. only have to worry about errors that don’t manifest themselves on every run of the program.
on x86, writing by 32 bits is faster than by 8 bits as there are fewer instructions. in general, the fastest way is to use the widest register that can be operated on at once. the speed of accessing memory from the cache is pretty cheap for nearby regions
don’t make changes for conceptual cleanliness. end of the day, want to make performant, bug free code in the shortest amount of time.
when programming some days you are off. this just means you’re going to be debugging a lot
we want it to be clear what our code can and cannot touch. global variables make this hard (however, can add _ to see where they are all used). however, as many OSs are rather janky and most code will live outside this, it is ok to have some globals here. globals are fine in development; can repackage them into a structure later.
clock speed is not as relevant as improvements in microarchitecture, and the number of cores means it can be more efficient under less duress. also, a lower clock speed may be because it wants to draw less power.
short build times (under 10 seconds) are incredibly important so as not to disincentivise making changes and testing them
function overloading, generalised operator overloading and default arguments are c++ features that can’t easily be implemented with gcc extensions
note that >> on signed values will typically (implementation-defined) perform an arithmetic shift (filling the new MSBs with copies of the sign bit), so it is not always the same as a divide. similarly, sign-extension fills the new MSBs with copies of the sign bit
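A tiny example of why the shift and the divide differ for negative values (C/C++ integer division truncates toward zero):

    int a = -7;
    int by_shift  = a >> 1;   // arithmetic shift on most compilers: -4
    int by_divide = a / 2;    // division truncates toward zero:     -3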
for large cross-platform projects, best to differentiate with filenames, rather than ifdefs. this also gives the ability to have different control flows across the different platforms (essential)
for a game, better to have the game as a service to the OS (not the other way round). this is because the game does not need to know/perform the myriad of possible operations the OS can perform.
most modern cpus have a floating point unit, making floats faster than ints (same latency), e.g. a multiply is one instruction where for ints it is two (multiply and shift). x87 is the FPU instruction set for x86 (there are also the SSE instructions, which are what you ultimately want). however, for multiplayer games, optimisers can give different results when using floating point, e.g. a platform that has operator fusion like a MULADD may give a different result when rounding than a platform that does it separately (fixed point could solve this)
Programming Mentality: always important to know when coding what your goal is. premature optimisation and design are bad. your goal dictates the quality of the code you write, e.g. allowed to be janky as a first pass on an API. write usage code first.
often when having a variable-length array, ask ourselves: do we really need that?
asserts are part of debug program that are used to check that things work that should always work. use them for a condition that must be true that is not explicitly present
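One common minimal form of such an assert is a write-to-null so the debugger stops right at the failing condition; this is a sketch (DEVELOPMENT_BUILD is a hypothetical build flag), not necessarily how any particular codebase defines it:

    // Evaluates the condition in debug builds and crashes at the call site if it is false.
    #if DEVELOPMENT_BUILD
        #define ASSERT(expr) do { if (!(expr)) { *(volatile int *)0 = 0; } } while (0)
    #else
        #define ASSERT(expr)
    #endif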
don’t think about memory management, but rather memory usage. if having to worry about freeing etc. done something wrong.
when writing ‘spec’ code, no need to handle all cases; simply note down the edge cases you should handle later on. stop yourself thinking about whether the code is messy or not; only care once the problem is solved!
it’s not the programming practice but the dogma that gets you. when you start to name things it almost always becomes bad. almost all programming practices have a place, just not used often. so, RAII people: in the case of things that must be released, e.g. a DeviceContext, it’s ok to use a constructor/destructor pair. I’ll throw you a bone there
streaming i/o is almost never a good choice (hard drive slowest, more errors)
compression orientated programming is you code what you need at the time (breaking out into function, combining into struct, etc.) over time the code marches towards a better overall quality
amdahl’s law gives the time taken for execution given a number of cores. for this formula (indeed any formula) we can obtain some property by seeing what happens as a parameter approaches infinity; in this case, the parallelisable part drops out. brooks’ law says that simply adding more people to a problem does not necessarily make it faster; if it requires a great deal of coordination/communication it actually slows things down.
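For reference, a minimal statement of the speedup form of Amdahl’s law (p is the parallelisable fraction of the work, n the core count):

    S(n) = \frac{1}{(1 - p) + \frac{p}{n}}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}

which is the limit the note refers to: as n grows, the p/n term drops out and only the serial fraction matters.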
solving a problem: 1. decide what you are doing (this can’t be open-ended) 2. organise groups to achieve this. by making these boundaries, we are presupposing that each part is separate, e.g. tyres team and engine team; we assume tyres and engine cannot be one piece. therefore, the boundaries define what products you can make, i.e. you produce products that are copies of yourself, or of how you are structured. so, in software, if we assign teams for say audio, 2d, 3d, we would expect individual APIs for each. the org chart is the asymptote, so the best case is that we make a product as granular as our org chart; it could be far worse and even more granular. therefore, communication between teams is more costly than communication within teams. the takeaway is that low-cost things can be optimised, high-cost things can’t be (further away on the org chart). note that communication in code could just be someone checking something in and you pulling it. what we are seeing now with modern software is the superposition of org charts due to the use of legacy codebases. now we see org charts in software, where people are artificially creating inheritance hierarchies that limit how the program works. this is very bad. the reason it’s done is for people to create mental models that help them solve the problem, as they can’t keep the complexity in their head. it may be necessary to solve the problem, however it shouldn’t be looked at as good. because it’s done due to a lack of understanding, the delegation/separation is not done with enough information, so you limit the possibilities of the design space. so although libraries, microservices, encapsulation, package managers and engines may be necessary due to our brain capacity (until neuralink or we figure out a better way to do them), they are not good! we may use a hash map, but only in a particular way. they limit optimisation as we have already decided the separation, so always be on the lookout for times when you don’t have to do these. most people just download hundreds of libraries because they know it works and they won’t be worse than anyone else. WE MUST BE LEAN AND FLEXIBLE IN ORG CHARTS, IN COMPANY AND IN SOFTWARE, TO INCREASE DESIGN. some old codebases need to be retired
DO THE ‘MOST CERTAIN’ THING FIRST. THIS COULD EITHER BE THE IMPLEMENTATION OR THE USAGE CODE choose data structures around solving problem
some software is scaffolding, i.e. not shipped with the final product, e.g. editor for games
To make anything alternate over time, just multiply by sine(time);
Data hiding hides what the CPU is doing, which is what we care about
Require machine-specific documentation files to understand the system we are on. System-specific ctags template projects, e.g. linux kernel, glibc, etc. If using a library, have ctags for that project
ALMOST ALWAYS CAST TO FLOAT WHEN DOING DIVISIONS LEADING TO FLOAT
Minimum value starts at max
Spreading out randomness:
final_value += contrib * sample
If debug code (or code that will not be in release) use compile-time macros
Use GLOBAL and global_prefix. If casting is occurring, always be explicit about it! Prefixing functions with sdl2_func() or linux_func() only if intending to be cross-platform
With error handling, bad practice is to allow a lot of errors, which brings in error classes etc. Instead, if it’s something that is actually an error, e.g. missing file, write the code to explicitly handle it. Handling the error in a sense makes it no longer an error, rather a feature of the program
if function is expecting a range between, should we clamp to it?
Refactoring Mentality: Refactoring is essential. You must know what you are trying to achieve so you have some notion of progress, e.g. adding a constraint to the system. (Replacing a variable type in scope gives us all locations where the variable was used, including macros. This is where dynamic languages fail, as you don’t know if you broke something)
You can abstract/encapsulate anything at any time, so why not do it later when you know what you are doing? We want the file format to be simple and binary, unlike json which is general purpose and string-based
modifying (pass as pointer arg), returning (result struct). it is best to put simple types as arguments rather than a group struct, to allow for maximum code reuse. only put a group struct as an argument if the members must go together (in general don’t pack, so you don’t force the user to create a struct when they may already have the values)
if can go functional without sacrificing, saves you complexity down the road. oftentimes simply utilising elements of functional programming is good, e.g. no global state, only operating on parameters etc.
writing code guides you to the right design, e.g. made the same call -> put it into a function; require args in a function -> struct; many related functions together -> organise related functions into their own file (if thinking it could be moved to another file, make comment sections outlining code blocks to ease this process later on) etc. (these are simple, compression changes); complex api -> transient struct, overload functions for internal/external use. There are also large changes that are more difficult, e.g. sections of a single value interpreted differently -> pack 2 variables into 1 (the _ technique is useful here); same operations performed on pairs of variables -> vectors (as opposed to working with them as scalars); vectors are particularly useful as without them repeated actions quickly become intractable. It can get messy at times, but always know that a clean solution is out there and you will refine towards it. Work through the error tree one at a time. Important to throw in asserts for underlying assumptions. Also, for debugging, be aware of ‘copy-pasta’, e.g. copied variables will have the same name for two parameters because you didn’t replace it
we don’t want to orient our code around objects (if anything, algorithm-oriented). how you arrive at some code determines how good it is
excessive pre-planning and setting waypoints is really only useful if you’ve done the problem before (in which case you’re not really designing). instead, we become a good pathfinder, learning to recognise the gradient of each day of the journey. we write the simplest thing and loft it out into good design later (in this explorative phase, if we make any changes for efficiency reasons we have just introduced the possibility of bugs with no benefit)
only break into function when you know what you want, e.g. called multiple times or code finalised and improved readability (in which case a tradeoff is made between understanding functionality and semantics)
Don’t be scared of mass name changing!! Before doing so, see all places where name is used Don’t be scared of long list of compiler errors. Work your way through them
Refactoring with usage code: just write out structures that satisfy the usage code. If major rewrite use #if 0 #endif to allow for successful compiling
refactoring: just copy code into a function in a way that gets it to compile; later, worry about passing information in as a parameter
basic debug and release compiler flags
When refactoring, utilise our vimrc
Debugging Mentality: debugging is stepping through pass by pass, inspecting all variables and parameters and verifying the state of particular ones. make deductions about the state of variables, e.g. overflowed, uninitialised, etc. drop in asserts. draw it out
being able to draw out debug information is very useful. time spent visualising is never wasted (in debugger expressions also)
When debugging, look through variables and see if anything looks ridiculous
When in the debugger, iteratively progress through variable values in the function and see if they look right. We can isolate some area of the code and say this is probably the problem, then investigate the relevant sub-functions, etc. This can be a long process with seemingly little gains. The issue could be subtle, e.g. signed/unsignedness, size, a function called rarely
Configuration files should be copied, not generated (becomes too messy) Symlink to template files from projects
To begin, I ensured that I had a debugger from which I could easily step through the application’s execution. In code, I was able to programmatically set system and user breakpoints.
(mocking of syscalls for unit testing with file i/o)
For handling non-fatal errors, single line check. For fatal errors, nest all preceding code (I have learnt to not be afraid of indentation in this manner). (error handling in general, i.e. reduce ‘errors’ by making them part of normal execution flow)
When performing the common task of grouping data, a few practices to keep in mind. Use fixed sized types to always know about struct padding (in fact, I like to extend this to all my code)
If wanting multiple ways of accessing grouped data, use a union and anonymous structs. Use an int to reference other structs, e.g. plane_index
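A minimal sketch of the union + anonymous struct pattern for multiple access paths (v3 matches the vector naming used elsewhere in these notes; anonymous structs are a widely supported compiler extension in C++):

    union v3
    {
        struct { float x, y, z; };   // named access: p.x
        float e[3];                  // indexed access: p.e[0]
    };
    // Both views alias the same 12 bytes, so p.x and p.e[0] are the same value.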
If the data being grouped can only exist together (e.g. points), use vectors. Put all struct-related typedefs inside their own header file for easy access.
As floats are an approximation, when comparing to 0.0f (say for a denominator check) or to negative (say for a square root) use a tolerance/epsilon less-than/greater-than check. In fact, whenever dividing, one should always ask oneself “can the value be zero?” (a sketch follows below)
To be clear about float-to-int casting, use a macro like truncate/round (think about what happens if the divide is uneven). Due to mixed integer and float arithmetic going to float, calculate integer percentages as val * 100 / total
There is no need to overload the division operator, as you can do (* 1.0f / val)
For easy substitution, use single-letter prefix names like output_h and output_w. Convention for variable arrays, e.g. Planes plane[1], planes and plane_count (use the ARRAY_COUNT macro here)
Put for loop statements on separate lines to help not be afraid of indentation
Iterate over pixel space and then convert to, say, world space for calculations (normalisation and lerp). Aspect ratio correction is simply rearranging a ratio: if we determine one side is larger, scale the other
Use an ASCII control code to print a status indicator. Only use const for char * string literals stored in the data segment
Endianness comes into play when reading/writing from disk (e.g. a file type magic value) and working directly with u8 * (e.g. iterating through the bytes of a u32)
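A small sketch of the epsilon denominator check and the integral percentage, with hypothetical names (safe_divide, percent_done):

    #include <stdint.h>

    #define EPSILON 0.0001f

    static float safe_divide(float numerator, float denominator)
    {
        // tolerance check rather than comparing a float to exactly 0.0f
        if (denominator < -EPSILON || denominator > EPSILON)
            return numerator / denominator;
        return 0.0f;
    }

    static uint32_t percent_done(uint32_t done, uint32_t total)
    {
        // keep the arithmetic integral so nothing silently promotes to float
        return done * 100 / total;
    }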
MARKETING APP: f5bot.com, https://github.com/lawxls/HackerNews-personalized I’m notified when keywords related to “human wants thing, my app can do thing” appear on HN, Reddit and Lobsters. If I can then contribute with information to that discussion, I’ll also leave a link to my app. Don’t just self plug, people (myself included) appreciate more detailed information on how they can solve their own personal problem, instead of being thrown into “here’s an app, figure it out”
even parity means the parity bit makes the total number of 1’s even, i.e. if the data has five 1’s, even parity adds a 1