pthread
CS 271
Prof. Calvin
08 Apr 24
wDd0
Announcements
- "no step on snek": Linkaroo
- Right now: You should have some networking code working.
- The sample binary and recommended fcntl implementation are bad and wrong in a way we will learn to improve on today.
- Blah blah blah C bad whatever just write code.
Today
- Fork()
- Hacks
- Sleep
- Pthread
- function pointers
- arg structs
- create/exit/join
- wordcount
C Bad
- Okay we can agree this is awful right.
char buf[20];
fcntl(0, F_SETFL, fcntl(0, F_GETFL) | O_NONBLOCK);
sleep(4);
int numRead = read(0, buf, 4);
if (numRead > 0) {
    buf[numRead] = '\0'; // read() does not NUL-terminate
    printf("You said: %s", buf);
}
- "sleep(4)" means do nothing for 4 seconds.
- "read(0, buf, 4)" means do nothing until there is something to read, then read up to 4 bytes into buf.
- But "fcntl(...)" with O_NONBLOCK changes what read means.
- sleep still takes the full 4 seconds, even if input shows up early.
- read no longer waits: it returns right away with whatever showed up during those 4 seconds, or with -1 (errno EAGAIN) if nothing did.
- So two different lines of code are quietly cooperating: sleep does the waiting that read was supposed to do.
- C bad. C bad!
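- A minimal sketch of what the non-blocking read is doing, with the return value and errno actually checked. The helper name try_read is made up for illustration and is not part of the sample binary.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): put fd in non-blocking mode,
 * try one read, and always NUL-terminate the result.
 * Returns the number of bytes read, 0 if nothing was waiting (or at end of
 * file), and -1 on a real error. */
int try_read(int fd, char *buf, size_t cap)
{
    if (cap == 0) {
        return -1;
    }
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        return -1;
    }

    ssize_t n = read(fd, buf, cap - 1);      /* leave room for '\0' */
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        buf[0] = '\0';
        return 0;                            /* no input yet: not a failure */
    }
    if (n == -1) {
        return -1;
    }
    buf[n] = '\0';
    return (int)n;
}
```

- Calling try_read(0, buf, sizeof buf) after sleep(4) gives the same behavior as the snippet above, minus the unterminated string.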
Fork()
"If there's a fork in the road, take it." -computers
- If C is gonna do two things at the same time, it should be less sketchy and use like, code blocks.
NAME
fork - create a child process
SYNOPSIS
#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);
DESCRIPTION
fork() creates a new process by duplicating the calling
process. The new process is referred to as the child
process. The calling process is referred to as the
parent process.
- Okay how does this thing work.
Fork()
- Let's use it...
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
int main(void)
{
    if (fork())
    {
        printf("Tis I, the elder and more terrible process.\n") ;
    } else {
        printf("Tis I, the more youthful and novel process.\n") ;
    }
    return 0 ;
}
- "If there's a fork, take it"
user@DESKTOP-THMS2PJ:~$ gcc text.c ; ./a.out
Tis I, the elder and more terrible process.
Tis I, the more youthful and novel process.
user@DESKTOP-THMS2PJ:~$
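- fork() actually has three outcomes, not two: -1 (it failed), 0 (you are the youth), or the youth's pid (you are the elder). A minimal sketch that tells them apart; run_youth is a made-up name, and waitpid() is the pid-specific cousin of wait().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a youth that exits with `code`, and have the
 * elder wait for it. Returns the youth's exit status, or -1 on failure. */
int run_youth(int code)
{
    pid_t pid = fork();
    if (pid == -1) {
        return -1;              /* fork failed: no youth was created */
    }
    if (pid == 0) {
        _exit(code);            /* youth: fork() returned 0 */
    }
    int status;                 /* elder: fork() returned the youth's pid */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

- For example, run_youth(7) returns 7 in the elder once the youth has exited.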
Fork()
- Sleep and read at the same time.
int main(void) {
    char buf[20];
    if (fork()) {
        sleep(4) ;
    } else {
        int numRead = read(0, buf, 4);
        if (numRead > 0) {
            buf[numRead] = '\0' ; // read() does not NUL-terminate
            printf("You said: %s", buf);
        }
    }
    return 0 ;
}
- This is a bit cleaner.
- No matter what, the program exits after 4 seconds.
- If there's input text (type and hit enter), the first 4 bytes of it are printed back.
- There are code blocks splitting up the execution clearly.
- Basically after fork() two different programs run, elder and youth.
- They can do things "at the same time".
- When elder ends, you get your prompt back, so it looks like they both end. (Strictly, youth is orphaned, not killed.)
- If you negate fork, then it doesn't work (elder must be the sleeper).
Fork()
- The elder and the youth know not how to share.
int x = 7 ; // worlds greatest int dont @ me
if (fork()) {
    x += 1 ;
    wait(NULL) ; // wait for youth, sys/wait.h
} else {
    x += 2 ;
}
printf("x = %d\n", x) ; // both processes run this line
- x never gets to 10
user@DESKTOP-THMS2PJ:~$ gcc text.c ; ./a.out
x = 9
x = 8
user@DESKTOP-THMS2PJ:~$
- Using wait() to determine precedence has pros and cons.
- Not sharing has pros and cons.
- There's ways to share info (like sockets of course) but do we want to do that.
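- A sketch of one way to share, using a pipe instead of a socket (the pipe is a swapped-in choice to keep the example short, not something from the slides). The youth adds 2 to its copy of x and mails the result to the elder, who adds 1, so x finally gets to 10. add_in_youth is a made-up name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the youth computes x + 2 in its own copy of memory
 * and sends the result back through a pipe; the elder then adds 1.
 * Returns the combined value, or -1 if anything fails. */
int add_in_youth(int x)
{
    int fds[2];
    if (pipe(fds) == -1) {
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       /* youth */
        close(fds[0]);
        x += 2;                           /* changes only the youth's x */
        ssize_t w = write(fds[1], &x, sizeof x);
        _exit(w == (ssize_t)sizeof x ? 0 : 1);
    }

    close(fds[1]);                        /* elder */
    int from_youth;
    ssize_t r = read(fds[0], &from_youth, sizeof from_youth);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (r != (ssize_t)sizeof from_youth) {
        return -1;
    }
    return from_youth + 1;
}
```

- The read() in the elder blocks until the youth writes, so here the pipe also does the job that wait() did for ordering.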
Fork()
- For the worst thing you've ever seen in your life, share memory with a file.
if (fork()) {
    for ( ; 1 ; i = (i + 1) % 26 ) {
        c = 'A' + i ;
        fp = fopen(FNAME, "w") ; fwrite(&c, 1, 1, fp) ; fclose(fp) ; sleep(SLEEP) ;
    }
} else {
    for ( ; 1 ; ) {
        printf("%c\n", c) ;
        fp = fopen(FNAME, "r") ; fread(&c, 1, 1, fp) ; fclose(fp) ; sleep(SLEEP) ;
    }
}
- This is a good way to find out why you need to null check system calls...
- But if you try it a few times it'll probably not break instantly at least.
- Sleep before fclose to cause disasters at high probability.
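- For contrast, a sketch of the same one-byte file trick with every call checked. put_char and get_char are made-up names; this does not make the file a good way to share memory, it only makes the failures visible instead of fatal.

```c
#include <stdio.h>

/* Hypothetical helpers: same one-byte "shared memory" file, but every call
 * that can fail is checked. Both return 0 on success and -1 on failure. */
int put_char(const char *path, char c)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    size_t n = fwrite(&c, 1, 1, fp);
    if (fclose(fp) != 0 || n != 1) {
        return -1;
    }
    return 0;
}

int get_char(const char *path, char *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;              /* e.g. the writer has not created it yet */
    }
    size_t n = fread(out, 1, 1, fp);
    fclose(fp);
    return n == 1 ? 0 : -1;     /* n == 0: file was opened for writing but is still empty */
}
```

- The reader can now skip a round when get_char returns -1 rather than crashing on a NULL file pointer.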
Okay but how are we supposed to wait around to read something for a fixed amount of time?
Today
- ✓ Fork()
- ✓ Hacks
- ✓ Sleep
- Pthread
- function pointers
- arg structs
- create/exit/join
- wordcount
<pthread.h>
Basically fork() was the worst so UNIX/POSIX invented pthreads
NAME
pthread_create - create a new thread
SYNOPSIS
#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine) (void *), void *arg);
Compile and link with -pthread.
DESCRIPTION
The pthread_create() function starts a new thread in the calling process.
The new thread starts execution by invoking start_routine(); arg is passed
as the sole argument of start_routine().
- Okay how does this thing work.
Function pointers
void *(*start_routine) (void *)
- That is a pointer to a function that accepts a void * argument and has a void * return value.
- Here's an example:
void *func( void *ptr ) {
    while(!sleep(1)) {
        printf("%d\n", *(int *)ptr ) ;
    }
    return NULL ;
}
- Here's how we'd make a variable that points to func()
void * (*fptr)(void *) = &func ;
- When we make a pthread, it needs somewhere to start executing - kinda like main().
- With fork(), execution just followed the fork() call - again, pros and cons.
- The void * argument and return value allow us to
- Use a pointer to any variable or struct to hold the arguments or the return values
- Use casts to read from the argument or return value.
printf("%d\n", *(int *)ptr ) ;
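- A small sketch of the same idea without any threads: two functions with the void * in, void * out shape, and a helper that calls whichever one it is handed. The names add_one, negate, and call_it are invented for illustration.

```c
/* Two functions with the shape pthread_create expects: void * in, void * out. */
void *add_one(void *ptr)
{
    *(int *)ptr += 1;               /* cast back to the real type before use */
    return ptr;
}

void *negate(void *ptr)
{
    *(int *)ptr = -*(int *)ptr;
    return ptr;
}

/* Hypothetical helper: calls whatever function it is handed, roughly the
 * way a thread library ends up calling start_routine(arg). */
void *call_it(void *(*routine)(void *), void *arg)
{
    return routine(arg);
}
```

- With int v = 41, call_it(&add_one, &v) leaves v at 42, and call_it(negate, &v) then makes it -42. The & in front of the function name is optional.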
pthread_create
pthread_create( &tid, NULL, &func, (void *) &val ) ;
- Imagine we have func, which prints its argument every one second.
- Here's how we set up a pthread to run func.
int main(void) {
    pthread_t tid ;
    int val = 0 ;
    pthread_create( &tid, NULL, &func, (void *) &val ) ;
    while(!sleep(1)) { val++ ; }
}
- pthread_create has four arguments:
- Where to store the thread id
- Some options, which we will deal with later or never
- The big spooky function pointer #ominous
- The arguments as a void *, usually cast from a meaningful data structure or data type.
- This code then keeps increasing val, and we can observe what happens...
Pthreads
- Pretty unlikely to get numbers exactly in order, and that's okay.
user@DESKTOP-THMS2PJ:~$ gcc test.c ; ./a.out
0
2
3
4
4
6
7
8
9
10
- There are ways to synchronize this (out of scope): try "man -k pthread_spin".
- This is the cool, good, fun way to do things.
- C good!
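- Synchronization stays out of scope, but for the curious, here is a minimal sketch of the idea using a pthread mutex (swapped in here for the spin locks the man search turns up, since a mutex is the more common starting point). It also uses pthread_join, which the next slide introduces. All names are made up.

```c
#include <pthread.h>

#define ROUNDS 100000

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread function: bump the shared counter ROUNDS times, one at a time. */
static void *bump(void *unused)
{
    (void)unused;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&lock);      /* only one thread past this point */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Hypothetical helper: run two bumping threads and return the final count,
 * or -1 if a thread could not be created. */
long count_with_two_threads(void)
{
    pthread_t a, b;
    counter = 0;
    if (pthread_create(&a, NULL, bump, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, bump, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

- With the lock in place the result is 2 * ROUNDS every time. Delete the lock/unlock pair and the total will usually come up short.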
exit/join
- When you create a pthread, it runs until its function returns or until whatever created the thread terminates.
- Sometimes, we want to run until the last pthread is done with whatever it's doing.
- We achieve this with pthread_join and pthread_exit.
int pthread_join(pthread_t thread, void **retval);
void pthread_exit(void *retval);
- We can think of pthread_exit similar to stdlib exit() - it's a way to end the thread, rather than the program.
- We can think of pthread_join similar to wait() - it's a way to keep the caller around until a callee finishes their job.
- Let's see an example that I tricked ChatGPT into making. It's a little scuffed, but fun.
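- Before that, a smaller warm-up sketch that touches create, exit, join, and an arg struct in one go. The struct and function names are invented for illustration.

```c
#include <pthread.h>

/* Hypothetical arg struct: the input and the output travel together. */
struct square_job {
    int input;
    int output;
};

/* Thread function: square the input and hand the struct back on exit. */
static void *square(void *arg)
{
    struct square_job *job = arg;
    job->output = job->input * job->input;
    pthread_exit(job);          /* same effect here as `return job;` */
}

/* Hypothetical helper: square n on another thread and wait for the answer.
 * Returns -1 if the thread could not be created or joined. */
int square_in_thread(int n)
{
    struct square_job job = { n, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, square, &job) != 0) {
        return -1;
    }

    void *retval = NULL;
    if (pthread_join(tid, &retval) != 0) {
        return -1;
    }
    struct square_job *done = retval;   /* the pointer given to pthread_exit */
    return done->output;
}
```

- square_in_thread(6) returns 36. The struct lives on the caller's stack, which is only safe because the caller joins before returning.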
Today
- ✓ Fork()
- ✓ Hacks
- ✓ Sleep
- ✓ Pthread
- ✓ function pointers
- ✓ arg structs
- ✓ create/exit/join
- wordcount
Word Count Problem and Pthreads Solution
- The word count problem involves counting the number of words in a given text file. To efficiently solve this problem, we can utilize multiple threads with the pthreads library.
- Using pthreads allows us to divide the file into smaller chunks and assign each chunk to a separate thread for processing. Each thread independently counts the words within its assigned chunk, and the individual counts are later combined to obtain the total word count.
- This approach runs the chunks in parallel, which can beat a single-threaded count, especially on large files where the counting work outweighs the cost of starting threads.
Include Libraries
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <ctype.h>
- First, the code includes the libraries it needs: stdio.h (file handling and printing to the console), stdlib.h (memory allocation), pthread.h (thread creation), and ctype.h (isspace).
#define MAX_THREADS 4
#define BUFFER_SIZE 1024
- Constants are defined for maximum threads and buffer size.
- These constants are used to control the number of threads and the size of the buffer for file reading.
Define Thread Data Structure
struct thread_data {
    char *buffer;
    int start;
    int end;
    int word_count;
};
- A structure is defined to hold data for each thread.
- It includes a buffer to store file content, start and end indices for processing, and a word count.
- Quick, how many bytes is thread_data?
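- One way to check your answer without spoiling it here: ask the compiler. This sketch compares the sum of the member sizes against sizeof for the whole struct; the two helper names are made up. If the numbers differ on your machine, the gap is padding.

```c
#include <stddef.h>

struct thread_data {
    char *buffer;
    int start;
    int end;
    int word_count;
};

/* Sum of the member sizes, ignoring any padding the compiler adds. */
size_t members_only(void)
{
    return sizeof(char *) + 3 * sizeof(int);
}

/* What the compiler actually reserves for one struct thread_data. */
size_t actual_size(void)
{
    return sizeof(struct thread_data);
}
```

- Printing both with printf("%zu %zu\n", members_only(), actual_size()) shows whether your platform pads the struct.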
Thread Function
void* count_words(void *arg) {
    struct thread_data *data = (struct thread_data*)arg;
    for (int i = data->start; i < data->end; i++) {
        // isspace needs an unsigned char value, so cast first
        if (isspace((unsigned char)data->buffer[i])) {
            data->word_count++;
        }
    }
    return NULL;
}
- The thread function 'count_words' counts the number of words in a given range of the buffer.
- It iterates through the buffer and increments the word count when encountering whitespace characters.
- Quick, what is the type of count_words? Why?
- ChatGPT uses "isspace" here - it's counting spaces, not words.
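- A sketch of one way to count words rather than spaces: count the places where a word starts. This is not what the generated program does, and count_word_starts is a made-up name. Peeking at the character before the chunk means a word that straddles two chunks is counted only once.

```c
#include <ctype.h>

/* Hypothetical replacement for the counting loop: count word STARTS in
 * [start, end). A word starts at a non-space character whose predecessor
 * is a space, or which is the very first character of the buffer. */
int count_word_starts(const char *buffer, int start, int end)
{
    int count = 0;
    for (int i = start; i < end; i++) {
        int here = !isspace((unsigned char)buffer[i]);
        int before = (i == 0) || isspace((unsigned char)buffer[i - 1]);
        if (here && before) {
            count++;
        }
    }
    return count;
}
```

- On the 14-character text "one  two\nthree" this gives 3 for the whole range, and still 3 in total if the range is split at index 6, in the middle of "two".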
Main Function
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s <filename>\n", argv[0]);
        return 1;
    }
    FILE *file = fopen(argv[1], "r");
    if (!file) {
        printf("Could not open file.\n");
        return 1;
    }
    // Remaining code omitted for brevity...
}
- The main function is the entry point of the program.
- It takes a filename as an argument and opens the file for reading.
- If the file cannot be opened, it prints an error message and exits.
- If ChatGPT can null-check system calls, so can you.
Allocate Memory
char *buffer = (char*)malloc(file_size);
if (!buffer) {
    printf("Memory allocation failed.\n");
    fclose(file);
    return 1;
}
- Memory is allocated for the buffer to hold the file content.
- If memory allocation fails, an error message is printed, and the program exits.
- I would never just rip an entire file into a malloc, since that seems mean to the computer, but this is a computer doing it to another computer so it's okay.
- Quick, how do you do this without reading the entire file?
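- One possible answer, as a single-threaded sketch: read the file one BUFFER_SIZE block at a time and remember whether the previous block ended in the middle of a word. count_words_streaming is a made-up name, and this is only one of several reasonable approaches.

```c
#include <ctype.h>
#include <stdio.h>

#define BUFFER_SIZE 1024

/* Hypothetical sketch: count words by reading one BUFFER_SIZE block at a
 * time. in_word carries over between blocks, so a word split across two
 * reads is counted once. Returns -1 on a read error. */
long count_words_streaming(FILE *file)
{
    char block[BUFFER_SIZE];
    long words = 0;
    int in_word = 0;
    size_t got;

    while ((got = fread(block, 1, sizeof block, file)) > 0) {
        for (size_t i = 0; i < got; i++) {
            if (isspace((unsigned char)block[i])) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;        /* first character of a new word */
                words++;
            }
        }
    }
    return ferror(file) ? -1 : words;
}
```

- Memory use stays at BUFFER_SIZE bytes no matter how large the file is.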
Read File Content
fread(buffer, 1, file_size, file);
fclose(file);
- The file content is read into the buffer using fread.
- If the read operation fails, an error message is printed, and the program exits.
- ChatGPT just stopped checking the return values of system calls here. Your jobs are safe.
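- A sketch of what the checked version could look like. The original program's size calculation was omitted, so the fseek/ftell approach here is an assumption, and load_file is a made-up name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: load a whole file, checking every call that can
 * fail. On success returns a malloc'd buffer (caller frees) and stores its
 * length in *size_out; on any failure returns NULL. */
char *load_file(const char *path, long *size_out)
{
    FILE *file = fopen(path, "rb");
    if (!file) {
        return NULL;
    }

    char *buffer = NULL;
    long size = -1;
    if (fseek(file, 0, SEEK_END) == 0 && (size = ftell(file)) >= 0
        && fseek(file, 0, SEEK_SET) == 0) {
        buffer = malloc(size > 0 ? (size_t)size : 1);
    }
    if (buffer && fread(buffer, 1, (size_t)size, file) != (size_t)size) {
        free(buffer);               /* short read: treat it as a failure */
        buffer = NULL;
    }
    fclose(file);
    if (buffer) {
        *size_out = size;
    }
    return buffer;
}
```

- The caller only has one thing to check: whether load_file returned NULL.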
Create Threads
pthread_t threads[MAX_THREADS];
struct thread_data thread_data_array[MAX_THREADS];
int start = 0;
for (int i = 0; i < MAX_THREADS; i++) {
    thread_data_array[i].buffer = buffer;
    thread_data_array[i].start = start;
    thread_data_array[i].end = start + chunk_size + (i < remaining ? 1 : 0);
    thread_data_array[i].word_count = 0;
    pthread_create(&threads[i], NULL, count_words, (void*)&thread_data_array[i]);
    start = thread_data_array[i].end;
}
- Threads are created to process the file content in parallel.
- Each thread is assigned a portion of the buffer to count words.
- I think a human would write this in a way where the arithmetic is easier, but maybe not.
- Presumably a human wrote this somewhere and it's just plagiarized.
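- I was pretty sure count_words needed a unary & prefix there, but what do I know. It runs fine. (A function name on its own converts to a pointer to that function, so count_words and &count_words mean the same thing here.)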
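- A sketch of one way a human might do the chunk arithmetic: compute each thread's range directly from its index, with no running start and no remainder bookkeeping. chunk_bounds is a made-up name, and this is an alternative, not the generated program's method.

```c
/* Hypothetical helper: thread i of n gets the half-open range
 * [i * total / n, (i + 1) * total / n). Neighbouring ranges share an
 * endpoint, so together they cover [0, total) with no gaps or overlap. */
void chunk_bounds(long total, int n, int i, long *start, long *end)
{
    *start = (long)i * total / n;
    *end = (long)(i + 1) * total / n;
}
```

- For total = 10 and n = 4 the ranges come out as [0,2), [2,5), [5,7), [7,10).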
Join Threads
for (int i = 0; i < MAX_THREADS; i++) {
    pthread_join(threads[i], NULL);
}
- The main thread waits for all worker threads to finish using pthread_join.
- This ensures that the total word count is accurate before printing.
Calculate Total Word Count
int total_word_count = 0;
for (int i = 0; i < MAX_THREADS; i++) {
    total_word_count += thread_data_array[i].word_count;
}
- The total word count is calculated by summing up individual thread word counts.
Print Result
printf("Total word count: %d\n", total_word_count);
free(buffer);
return 0;
- The total word count is printed to the console.
- Memory allocated for the buffer is freed to prevent memory leaks.
- The program terminates successfully.
Today
- ✓ Fork()
- ✓ Hacks
- ✓ Sleep
- ✓ Pthread
- ✓ function pointers
- ✓ arg structs
- ✓ create/exit/join
- ✓ wordcount