Insights on Java, Big Data, Search, Cloud, Algorithms, Data Science, Machine Learning...
A book by Damien Benveniste of AIEdge. Though a work in progress, the chapters (2-4) available for preview are fantastic.
Looking forward to a paperback edition, which I certainly hope to own...
Mozilla pedigree, AI focus, Open-source, Dev oriented.
Blueprint Hub: Mozilla.ai's hub of open-source, templatized, customizable AI solutions for developers.
Lumigator: Platform for model evaluation and selection. Consists of a Python FastAPI backend for AI lifecycle management & for capturing workflow data useful for evaluation.
Streamlit is a web wrapper for Data Science projects in pure Python. It's a lightweight, simple, rapid-prototyping web-app framework for sharing scripts.
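A minimal sketch of what such a script looks like (a hypothetical example; save as app.py & launch with "streamlit run app.py"):
# app.py - hypothetical minimal Streamlit script
import pandas as pd
import streamlit as st

st.title("Quick Data Peek")
df = pd.DataFrame({"x": list(range(10)), "y": [i * i for i in range(10)]})
st.write(df)            # renders the DataFrame as an interactive table
st.line_chart(df["y"])  # a chart in one line, no web code needed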
Quick notes around the Chinchilla Scaling Law, its limits & beyond, for Deep Learning and LLMs.
Factors
The intuitive way, a model's performance (loss) depends on factors like model size (parameters), training data volume & compute budget.
Beyond such common sense, though, theoretical foundations linking the factors aren't available right now. Perhaps it's the nature of the problem being hard (NP).
The next best thing, then, is to work out the relationships/ bounds empirically: train existing Deep Learning models, LLMs, etc. on data sets spanning TBs/ PBs, with up to trillions of parameters, using compute budgets cumulatively spanning years.
Papers by Hestness & Narang, Kaplan et al. & Chinchilla are all attempts along the empirical route. So are more recent papers like Mosaic, DeepSeek, MoE, Llama-3 & Microsoft, among many others.
Key takeaway being: for a given compute budget, model size & training tokens should be scaled in roughly equal proportion, with ~20 training tokens per parameter as the Chinchilla rule of thumb.
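As a rough back-of-the-envelope illustration (assuming the commonly used approximation of C ~= 6*N*D training FLOPs together with the ~20 tokens-per-parameter heuristic; the paper's actual fitted law differs):
# Chinchilla-style sizing from a FLOP budget (heuristics only)
# Assumes: C ~= 6*N*D, and compute-optimal D ~= 20*N
def compute_optimal(flops_budget):
    n_params = (flops_budget / 120) ** 0.5  # from C = 6*N*(20*N) = 120*N^2
    n_tokens = 20 * n_params
    return n_params, n_tokens

n, d = compute_optimal(5.8e23)  # roughly Chinchilla's training budget
print("params ~ %.1e, tokens ~ %.1e" % (n, d))  # ~7e10 params, ~1.4e12 tokens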
References
Diffusion
Non-equilibrium Thermodynamics
Gaussian Noise
Variational Inference
Loss Functions
Score Based Generative Model
Conditional (Guided) Generation
Latent Variable Generative Model
References
A way to categorize Spark API features:
The diagram is based on code within various Spark test suites.
Continuing with the same PySpark (ver 2.1.0, Python 3.5, etc.) setup explained in an earlier post. In order to connect to the mocked Kinesis stream on Localstack from PySpark, use the kinesis_wordcount_asl.py script located in the Spark external/ (connector/) folder.
(a) Update value of master in kinesis_wordcount_asl.py
Update the value of master (local[n], spark://localhost:7077, etc.) in the SparkContext in kinesis_wordcount_asl.py:
sc = SparkContext(appName="PythonStreamingKinesisWordCountAsl", master="local[2]")
(b) Add the compiled aSpark jars to the Spark Driver/ Executor classpath
As explained in step (III) of an earlier post, to work with Localstack a few changes were made to KinesisReceiver.scala onStart() to explicitly set the endpoint on the kinesis, dynamoDb & cloudWatch clients. Accordingly, the compiled aSpark jars with the modifications need to be added to the Spark Driver/ Executor classpath.
export aSPARK_PROJ_HOME="/Download/Location/aSpark"
export SPARK_CLASSPATH="${aSPARK_PROJ_HOME}/target/original-aSpark_1.0-2.1.0.jar:${aSPARK_PROJ_HOME}/target/scala-2.11/classes:${aSPARK_PROJ_HOME}/target/scala-2.11/jars/*"
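For context, the endpoint change in KinesisReceiver.scala onStart() boils down to pointing each AWS client explicitly at Localstack instead of the AWS default. The same idea, sketched with boto3 purely for illustration:
# Endpoint override illustrated with boto3 (the actual change is in Scala)
import boto3

kinesis = boto3.client("kinesis",
                       region_name="us-east-1",
                       endpoint_url="http://localhost:4566",  # Localstack, not AWS
                       aws_access_key_id="test",              # mock credentials
                       aws_secret_access_key="test")
print(kinesis.list_streams()["StreamNames"])  # should include myFirstStream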
(c) Ensure SPARK_HOME, PYSPARK_PYTHON & PYTHONPATH variables are exported.
(d) Run kinesis_wordcount_asl
python3.5 ${SPARK_HOME}/external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py SampleKinesisApplication myFirstStream http://localhost:4566/ us-east-1
aws --endpoint-url=http://localhost:4566 kinesis put-record --stream-name myFirstStream --partition-key 123 --data "testdata abcd"
In this post we get a Spark streaming application working with an AWS Kinesis stream, a mocked version of Kinesis running locally on Localstack. In earlier posts we have explained how to get Localstack running and various AWS services up on Localstack. The client connections to the AWS services (Localstack) are done using the AWS cli and AWS Java-Sdk v1.
Environment: This set-up continues on Ubuntu 20.04, with Java-8, Maven-3.6x, Docker-24.0x, Python 3.5, PySpark/ Spark-2.1.0, Localstack-3.8.1 & AWS Java-Sdk-v1 (ver 1.12.778).
Once the Localstack installation is done, steps to follow are:
(I) Start Localstack
# Start locally
localstack start
That should get Localstack running on: http://localhost:4566
(II) Check Kinesis services from CLI on Localstack
# List Streams
aws --endpoint-url=http://localhost:4566 kinesis list-streams
# Create Stream
aws --endpoint-url=http://localhost:4566 kinesis create-stream --stream-name myFirstStream --shard-count 1
# List Streams
aws --endpoint-url=http://localhost:4566 kinesis list-streams
# describe-stream-summary
aws --endpoint-url=http://localhost:4566 kinesis describe-stream-summary --stream-name myFirstStream
# Put Record
aws --endpoint-url=http://localhost:4566 kinesis put-record --stream-name myFirstStream --partition-key 123 --data "testdata abcd"
aws --endpoint-url=http://localhost:4566 kinesis put-record --stream-name myFirstStream --partition-key 123 --data "testdata efgh"
(III) Connect to Kinesis from Spark Streaming
# Build
mvn install -DskipTests=true -Dcheckstyle.skip
# Run JavaKinesisWordCountASL with Localstack
(IV) Add Data to Localstack Kinesis & View Counts on Console
a) Put Record from cli
aws --endpoint-url=http://localhost:4566 kinesis put-record --stream-name myFirstStream --partition-key 123 --data "testdata abcd"
aws --endpoint-url=http://localhost:4566 kinesis put-record --stream-name myFirstStream --partition-key 123 --data "testdata efgh"
b) Alternatively Put records from Java Kinesis application
Download, build & run AmazonKinesisRecordProducerSample.java
c) Now check the output console of JavaKinesisWordCountASL run in step (III) above. Counts of the words streamed from Localstack Kinesis will be displayed on the console.
In continuation to the earlier post regarding debugging Pyspark, here we show how to debug the Spark Scala/ Java side. Spark is a distributed processing environment whose core is Scala, with APIs for connecting from different languages like Python & Java. The high level Pyspark Architecture is shown here.
For debugging the Spark Scala/ Java components, as these run within the JVM, it's easy to make use of Java Tooling Options for remote debugging from any compatible IDE such as Idea (Eclipse no longer supports Scala). A few points to remember:
Steps:
Environment: Ubuntu-20.04 having Java-8, Spark/Pyspark (ver 2.1.0), Python3.5, IntelliJ Idea (ver 2024.3), Maven3.6
(I) Idea Remote JVM Debugger
In Idea > Run/ Debug Config > Edit > Remote JVM Debug.
(II)(a) Debug Spark Standalone cluster
Key features of the Spark Standalone cluster are:
In order to debug, say, an Executor, a Spark Standalone cluster can be started with 1 Master, 1 Worker & 1 Executor.
# Start Master (Check http://localhost:8080/ to get Master URL/ PORT)
./sbin/start-master.sh
# Start Slave/ Worker
./sbin/start-slave.sh spark://<MASTER_URL>:<MASTER_PORT>
# Add JVM tooling options to spark.executor.extraJavaOptions in spark-defaults.conf
spark.executor.extraJavaOptions -agentlib:jdwp=transport=dt_socket,server=n,address=localhost:5005,suspend=n
# The value could instead be passed as a conf to SparkContext in Python script:
from pyspark.conf import SparkConf
confVals = SparkConf()
confVals.set("spark.executor.extraJavaOptions","-agentlib:jdwp=transport=dt_socket,server=n,address=localhost:5005,suspend=y")
sc = SparkContext(master="spark://localhost:7077",appName="PythonStreamingStatefulNetworkWordCount1",conf=confVals)
(II)(b) Debug locally with master="local[n]"
export JAVA_TOOL_OPTIONS="-agentlib:jdwp=transport=dt_socket,server=n,suspend=n,address=5005"
(III) Execute PySpark Python script
python3.5 ${SPARK_HOME}/examples/src/main/python/streaming/network_wordcount.py localhost 9999
This should start off Pyspark & connect the Executor JVM to the waiting Idea Remote JVM debugger instance for debugging.
An earlier post shows how to run Pyspark (Spark 2.1.0) in Eclipse (ver 2024-06 (4.32)) using the PyDev (ver 12.1) plugin. The OS is Ubuntu-20.04 with Java-8, & an older version of Python3.5 compatible with PySpark (2.1.0).
While the Pyspark code runs fine within Eclipse, when trying to Debug an error is thrown:
"Pydev: Unexpected error setting up the debugger: Socket closed".
This is due to a higher Python version requirement (>= 3.6) for the pydevd debugger module within PyDev. Details from the PyDev installations page clearly state that Python3.5 is compatible only with PyDev 9.3.0. So it's back to square one.
Install/ replace Pydev 12.1 with PyDev 9.3 in Eclipse
Test debugging Pyspark
Refer to the steps to Run Pyspark on PyDev in Eclipse, & ensure the PyDev Interpreter is python3.5, PYSPARK_PYTHON variable and PYTHONPATH are correctly setup.
Finally, right click on network_wordcount.py > Debug as > Python run
(Set up Debug Configurations > Arguments & provide program arguments, e.g. "localhost 9999", & any breakpoints in the python code to test).
This post captures the steps to get Spark (ver 2.1) working within Eclipse (ver 2024-06 (4.32)) using the PyDev (ver 12.1) plugin. The OS is Ubuntu-20.04 with Java-8, Python 3.x & Maven 3.6.
(I) Compile Spark code
The Spark code is downloaded & compiled from a location "SPARK_HOME".
export SPARK_HOME="/SPARK/DOWNLOAD/LOCATION"
cd ${SPARK_HOME}
mvn install -DskipTests=true -Dcheckstyle.skip -o
(Issue: For a "Failed to execute goal org.scalastyle:scalastyle-maven-plugin:0.8.0:check" error,
copy scalastyle-config.xml to the sub-project (next to pom.xml) having the error.)
(II) Compile Pyspark
(a) Install Pyspark dependencies
sudo apt-get install pandoc
pip3 install pypandoc==1.5
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get install python3.5
(b) Build Pyspark
cd ${SPARK_HOME}/python
export PYSPARK_PYTHON=python3.5
# Build - creates ${SPARK_HOME}/python/build
python3.5 setup.py build
# Dist - creates ${SPARK_HOME}/python/dist
python3.5 setup.py sdist
(c) Export PYTHONPATH
export PYTHONPATH=$PYTHONPATH:${SPARK_HOME}/python/:${SPARK_HOME}/python/lib/py4j-0.10.4-src.zip:${SPARK_HOME}/python/pyspark/shell.py;
(III) Run Pyspark from console
Pyspark setup is done & the standalone examples code should run. Ensure the variables ${SPARK_HOME}, ${PYSPARK_PYTHON} & ${PYTHONPATH} are all correctly exported (steps (I), (II)(b) & (II)(c) above):
python3.5 ${SPARK_HOME}/python/build/lib/pyspark/examples/src/main/python/streaming/network_wordcount.py localhost 9999
(IV) Run Pyspark on PyDev in Eclipse
(a) Eclipse with PyDev plugin installed:
Set-up tested on Eclipse (ver 2024-06 (4.32.0)) and PyDev plugin (ver 12.1x).
(b) Import the spark project in Eclipse
There would be compilation errors due to missing Spark Scala classes.
(c) Add Target jars for Spark Scala classes
Eclipse no longer has support for Scala, so the corresponding Spark Scala classes are missing. A workaround is to add the Scala target jars compiled using mvn (in step (I) above) manually to:
spark-example > Properties > Java Build Path > Libraries
(d) Add PyDev Interpreter for Python3.5
Go to: spark-example > Properties > PyDev - Interpreter/ Grammar > Click to configure an Interpreter not listed > Open Interpreter Preferences Page > New > Choose from List:
& Select /usr/bin/python3.5
On the same page, under the Environment tab add a variable named "PYSPARK_PYTHON" having value "python3.5"
(e) Set up PYTHONPATH for PyDev
spark-example > Properties > PyDev - PYTHONPATH
${SPARK_HOME}/python/
${SPARK_HOME}/python/lib/py4j-0.10.4-src.zip
With that Pyspark should be properly set-up within PyDev.
(f) Run Pyspark from Eclipse
Right click on network_wordcount.py > Run as > Python run
(You can further change Run Configurations > Arguments & provide program arguments, e.g. "localhost 9999")
Sad that Scala IDE for Eclipse is no longer supported. While it was great to have Scala integrated within Eclipse, guess the headwinds were too strong!
Next up on Mock for clouds is Moto. Moto is primarily for running tests within the Python ecosystem.
Moto does offer a standalone server mode for other languages. The general sense was that the standalone Moto server would offer the AWS services accessible from the cli & non-Python SDKs. Gave Moto a shot with the same AWS services tried with Localstack.
(I) Set-up
While installing Moto, ran into a couple of dependency conflicts across moto, boto3, botocore, requests, s3transfer & in turn with the installed awscli. With some effort, reached a sort of dynamic equilibrium with (installed via pip):
(II) Start Moto Server
# Start Moto
moto_server -p3000
# Start Moto as Docker (Sticking to this option)
docker run --rm -p 5000:5000 --name moto motoserver/moto:latest
(III) Invoke services on Moto
(a) S3
# Create bucket
aws --endpoint-url=http://localhost:5000 s3 mb s3://test-buck
# Copy item to bucket
aws --endpoint-url=http://localhost:5000 s3 cp a1.txt s3://test-buck
# List bucket
aws --endpoint-url=http://localhost:5000 s3 ls s3://test-buck
--
(b) SQS
# Create queue
aws --endpoint-url=http://localhost:5000 sqs create-queue --queue-name test-q
# List queues
aws --endpoint-url=http://localhost:5000 sqs list-queues
# Get queue attribute
aws --endpoint-url=http://localhost:5000 sqs get-queue-attributes --queue-url http://localhost:5000/123456789012/test-q --attribute-names All
--
(c) IAM
## Issue: Moto does a basic check of user role & gives an AccessDeniedException when calling Lambda CreateFunction operation
## So have to create a specific IAM role (https://github.com/getmoto/moto/issues/3944#issuecomment-845144036) in Moto for the purpose.
aws iam --region=us-east-1 --endpoint-url=http://localhost:5000 create-role --role-name "lambda-test-role" --assume-role-policy-document "some policy" --path "/lambda-test/"
--
(d) Lambda
# Create Java function
aws --endpoint-url=http://localhost:5000 lambda create-function --function-name test-j-div --zip-file fileb://original-java-basic-1.0-SNAPSHOT.jar --handler example.HandlerDivide::handleRequest --runtime java8.al2 --role arn:aws:iam::123456789012:role/lambda-test/lambda-test-role
# List functions
aws --endpoint-url=http://localhost:5000 lambda list-functions
# Invoke function (Fails!)
aws --endpoint-url=http://localhost:5000 lambda invoke --function-name test-j-div --payload '[235241,17]' outputJ.txt
The invoke function fails with the message:
"WARNING - Unable to parse Docker API response. Defaulting to 'host.docker.internal'
<class 'json.decoder.JSONDecodeError'>::Expecting value: line 1 column 1 (char 0)
error running docker: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))".
Retried this from the AWS Java-SDK & for other nodejs & python functions, but nothing worked. While this remains unsolved for now, check out the Lambci docker option next.
(IV) Invoke services on Lambci Lambda Docker Images:
Moto's Lambda docs also mention its dependent docker images from lambci/lambda & mlupin/docker-lambda (for newer runtimes). Started off with a slightly older java8.al2 docker image from lambci/lambda.
# Download lambci/lambda:java8.al2
docker pull lambci/lambda:java8.al2
# Run lambci/lambda:java8.al2.
## Ensure to run from the location which has the unzipped (unjarred) Java code
## Here it's run from a folder called data_dir_java which has the unzipped (unjarred) class file folders: com/, example/, META-INF/, net/
docker run -e DOCKER_LAMBDA_STAY_OPEN=1 -p 9001:9001 -v "$PWD":/var/task:ro,delegated --name lambcijava8al2 lambci/lambda:java8.al2 example.HandlerDivide::handleRequest
# Invoke Lambda
aws --endpoint-url=http://localhost:9001 lambda invoke --function-name test-j-div --payload '[235241,17]' outputJ.txt
This works!
Continuing with Localstack, next is a closer look into the code to deploy and execute AWS Lambda code on Localstack from AWS Java-Sdk-v1. The localstack-lambda-java-sdk-v1 code uses the same structure used in localstack-aws-sdk-examples & fills in for the missing AWS Lambda bit.
The LambdaService class has 3 primary methods: listFunctions(), createFunction() & invokeFunction(). The static AWSLambda client is set up with mock credentials & points to the Localstack endpoint.
The main() method first creates the function (createFunction()), if it does not exist.
If all goes well, control returns to main(), which invokes listFunctions() to show details of the created Lambda function (& all other existing functions).
Finally, there is a call from main() to the invokeFunction() method.
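For illustration only, here is the same create -> list -> invoke flow sketched with boto3 against Localstack (the repo itself uses the AWS Java-Sdk-v1; function & role names are reused from the earlier CLI examples):
import boto3

lam = boto3.client("lambda", region_name="us-east-1",
                   endpoint_url="http://localhost:4566",  # Localstack endpoint
                   aws_access_key_id="test",              # mock credentials
                   aws_secret_access_key="test")

# createFunction(): register the jar against a mock role
with open("original-java-basic-1.0-SNAPSHOT.jar", "rb") as f:
    lam.create_function(FunctionName="test-j-div", Runtime="java8.al2",
                        Role="arn:aws:iam::000000000000:role/lambda-test",
                        Handler="example.HandlerDivide::handleRequest",
                        Code={"ZipFile": f.read()})

# listFunctions(): details of the created function (& all others existing)
for fn in lam.list_functions()["Functions"]:
    print(fn["FunctionName"], fn["Runtime"])

# invokeFunction(): run the function with a sample payload
resp = lam.invoke(FunctionName="test-j-div", Payload=b"[200,9]")
print(resp["Payload"].read())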
Comments welcome; localstack-lambda-java-sdk-v1 is available to play around with!
In continuation to the earlier post on mocks for clouds, this article does a deep dive into getting up & running with Localstack. This is a consolidation of the steps & best practices shared here, here & here. The Localstack set-up is on a Ubuntu-20.04, with Java-8x, Maven-3.8x, Docker-24.0x.
(I) Set-up
# Install awscli
sudo apt-get install awscli
# Install localstack ver 3.8
## Issue1: By default pip pulls in version 4.0, which gives an error:
## ERROR: Could not find a version that satisfies the requirement localstack-ext==4.0.0 (from localstack)
python3 -m pip install localstack==3.8.1
--
# Add to /etc/hosts
127.0.0.1 localhost.localstack.cloud
127.0.0.1 s3.localhost.localstack.cloud
--
# Configure AWS from cli
aws configure
aws configure set default.region us-east-1
aws configure set aws_access_key_id test
aws configure set aws_secret_access_key test
## Manually configure AWS
Add to ~/.aws/config
endpoint_url = http://localhost:4566
## Add mock credentials
Add to ~/.aws/credentials
aws_access_key_id = test
aws_secret_access_key = test
--
# Download docker images needed by the Lambda function
## Issue 2: Do this beforehand; Localstack gets stuck
## at the download-image stage unless the image is already available
## Pull java:8.al2
docker pull public.ecr.aws/lambda/java:8.al2
## Pull nodejs (required for other nodejs Lambda functions)
docker pull public.ecr.aws/lambda/nodejs:18
## Check images downloaded
docker image ls
(II) Start Localstack
# Start locally
localstack start
# Start as docker (add '-d' for daemon)
## Issue 3: Local directory's mount should be as per sample docker-compose
docker-compose -f docker-compose-localstack.yaml up
# Localstack up on URL's
http://localhost:4566
http://localhost.localstack.cloud:4566
# Check Localstack Health
curl http://localhost:4566/_localstack/info
curl http://localhost:4566/_localstack/health
(III) AWS services on Localstack from CLI
(a) S3
# Create bucket named "test-buck"
aws --endpoint-url=http://localhost:4566 s3 mb s3://test-buck
# Copy item to bucket
aws --endpoint-url=http://localhost:4566 s3 cp a1.txt s3://test-buck
# List bucket
aws --endpoint-url=http://localhost:4566 s3 ls s3://test-buck
--
(b) Sqs
# Create queue named "test-q"
aws --endpoint-url=http://localhost:4566 sqs create-queue --queue-name test-q
# List queues
aws --endpoint-url=http://localhost:4566 sqs list-queues
# Get queue attribute
aws --endpoint-url=http://localhost:4566 sqs get-queue-attributes --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/test-q --attribute-names All
--
(c) Lambda
aws --endpoint-url=http://localhost:4566 lambda list-functions
# Create Java function
aws --endpoint-url=http://localhost:4566 lambda create-function --function-name test-j-div --zip-file fileb://original-java-basic-1.0-SNAPSHOT.jar --handler example.HandlerDivide::handleRequest --runtime java8.al2 --role arn:aws:iam::000000000000:role/lambda-test
# List functions
aws --endpoint-url=http://localhost:4566 lambda list-functions
# Invoke Java function
aws --endpoint-url=http://localhost:4566 lambda invoke --function-name test-j-div --payload '[200,9]' outputJ.txt
# Delete function
aws --endpoint-url=http://localhost:4566 lambda delete-function --function-name test-j-div
(IV) AWS services on Localstack from Java-SDK
# For S3 & Sqs - localstack-aws-sdk-examples, java sdk
# For Lambda - localstack-lambda-java-sdk-v1
With your air. With your smog. With your AQIs. With your chart-topping PM levels. Delhi, this annual event of yours, wish we could skip!
Familiar noises echoing from the four estates are no balm to the troubled sinuses. They shout at the top of their lungs, we cough & sneeze from the bottom of ours.
Solution, now what's that? From whom, when, where & why? Since one can't really run away, perhaps we need to just hibernate or hide. Better still, grin and bear this way of lieF (sic).
There are well known scenarios like caching, pooling, etc. wherein object reuse is common. Testing these cases using a framework like Mockito could run into problems, especially if there's a need to verify the arguments sent by the Caller of a Service, where the Service is mocked.
ArgumentCaptor (Mockito) fails because it keeps references to the argument objects, which, due to reuse by the caller, all hold only the last/ latest value.
The discussion here led to using Void Answer as one possible way to solve the issue. The following (junit-3+, mockito-1.8+, commons-lang-2.5) code explains the details.
1. Service:
public class Service {
    public void serve(MutableInt value) {
        System.out.println("Service.serve(): " + value);
    }
}
2. Caller:
public class Caller {
    public void callService(Service service) {
        MutableInt value = new MutableInt();
        value.setValue(1);
        service.serve(value);
        value.setValue(2);
        service.serve(value);
    }
    ...
3. Tests:
public class MutableArgsTest extends TestCase {
    List<MutableInt> multiValuesWritten;
    @Mock
    Service service;

    @Override
    protected void setUp() {
        MockitoAnnotations.initMocks(this);               // wire up the @Mock fields
        multiValuesWritten = new ArrayList<MutableInt>(); // fresh capture list per test
    }
    /**
     * Failure with ArgumentCaptor
     */
    public void testMutableArgsWithArgCaptorFail() {
        Caller caller = new Caller();
        ArgumentCaptor<MutableInt> valueCaptor =
                ArgumentCaptor.forClass(MutableInt.class);
        caller.callService(service);
        verify(service, times(2)).serve(valueCaptor.capture());
        // AssertionFailedError: expected:<[1, 2]> but was:<[2, 2]>
        assertEquals(Arrays.asList(new MutableInt(1),
                new MutableInt(2)), valueCaptor.getAllValues());
    }
    /**
     * Success with Answer
     */
    public void testMutableArgsWithDoAnswer() {
        Caller caller = new Caller();
        doAnswer(new CaptureArgumentsWrittenAsMutableInt<Void>()).
                when(service).serve(any(MutableInt.class));
        caller.callService(service);
        verify(service, times(2)).serve(any(MutableInt.class));
        // Works!
        assertEquals(new MutableInt(1), multiValuesWritten.get(0));
        assertEquals(new MutableInt(2), multiValuesWritten.get(1));
    }
    /**
     * Captures arguments to the Service.serve() method:
     * - Multiple calls to serve() happen from the same caller
     * - Along with reuse of the MutableInt argument objects by the caller
     * - Each argument value is copied to a new MutableInt object & that copy is captured
     * @param <Void>
     */
    public class CaptureArgumentsWrittenAsMutableInt<Void> implements Answer<Void> {
        public Void answer(InvocationOnMock invocation) {
            Object[] args = invocation.getArguments();
            multiValuesWritten.add(new MutableInt(args[0].toString()));
            return null;
        }
    }
}
This post has info on manually restoring a Joomla 4.3.4 set-up across two servers. While both are Linux systems, the configurations differ slightly, including the OS, Php, DB, etc. Various issues were faced & overcome in doing the restoration.
Background info:
- Source:
Ubuntu 22.04, Php 8.2, Joomla 4.3.4, Apache, MariaDB, Addon Plugins (AddToAny, LazyDb, Komento, SexyPolling)
- Destination:
Ubuntu 20.04, Php 7.4, Joomla 4.3.4, Apache, MySql 8.0
- The latest DB dump & the htdocs folder (including all files, modules, plugins, media, images, etc.) from Source were transferred to the Destination server via Ftp beforehand.
Steps:
1) DB Import
1.1) Create user, db, grant all permission to user.
1.2) Import data to the created db from the latest DB dump of the source.
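Steps 1.1 & 1.2 sketched for MySQL 8.0 (site_db, site_user & the password are placeholders):
-- run as the MySQL root user
CREATE DATABASE site_db;
CREATE USER 'site_user'@'localhost' IDENTIFIED BY 'ChangeMe';
GRANT ALL PRIVILEGES ON site_db.* TO 'site_user'@'localhost';
FLUSH PRIVILEGES;
-- then import the source dump from the shell:
-- mysql -u site_user -p site_db < source_dump.sql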
1.2.1) ERROR 1366 (HY000) at line 2273: Incorrect integer value: '' for column 'checked_out' at row 1. Solution is to set NO_ENGINE_SUBSTITUTION & then import:
SET @@GLOBAL.sql_mode= 'NO_ENGINE_SUBSTITUTION';
1.3) ERROR 1101 (42000) at line 10692: BLOB, TEXT, GEOMETRY or JSON column 'country' can't have a default value
- Using a solution found online, the DB dump SQL import script was changed to set DEFAULT values for the problematic text columns (country, city, etc.):
// Modify the sexypolling plugin CREATE TABLE script:
CREATE TABLE `#_sexy_votes` (
`id_vote` int(10) unsigned NOT NULL AUTO_INCREMENT,
....
`country` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`city` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`region` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`countrycode` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
PRIMARY KEY (`id_vote`),
.....
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;
2) Download Joomla_4.3.4-Stable-Full_Package.zip from joomla.org
2.1) Unzip Joomla_4.3.4-Stable-Full_Package.zip to /var/www/html & rename folder to <site_name>
2.2) Set up site configuration.php (/var/www/html/<site_name>/configuration.php)
- Add db, username, password
- Add tmp_path & log_path in
public $log_path = '/var/www/html/<site_name>/administrator/logs';
public $tmp_path = '/var/www/html/<site_name>/tmp';
3) Restore Joomla modules, plugins, languages, etc from file system Ftp backup location of Source.
4) Additional system settings on Destination
4.1) Add missing Php modules: "Call to undefined function" error
4.1.1) simplexml_load_file()
sudo apt-get install php7.4-xml
4.1.2) "IntlTimeZone" module missing
sudo apt-get install php7.4-intl
4.2) Increase Php upload limit (/etc/php/7.4/apache2/php.ini)
post_max_size = 38M
upload_max_filesize = 32M
4.3) Restart apache
sudo systemctl reload apache2
5) Recovering from J4 Red Error Page of death
5.1) Redirection to installation/index.php:
- With an error "500 - Whoops, looks like something went wrong".
- Needed to delete the installation folder, to stop the redirection.
5.2) Next, 404 Component not found error on the home page:
---
404 Component not found.
Call stack
# Function Location
1 () JROOT/libraries/src/Component/ComponentHelper.php:296
2 Joomla\CMS\Component\ComponentHelper::renderComponent() JROOT/libraries/src/Application/SiteApplication.php:210
3 Joomla\CMS\Application\SiteApplication->dispatch() JROOT/libraries/src/Application/SiteApplication.php:251
4 Joomla\CMS\Application\SiteApplication->doExecute() JROOT/libraries/src/Application/CMSApplication.php:293
5 Joomla\CMS\Application\CMSApplication->execute() JROOT/includes/app.php:61
6 require_once() JROOT/index.php:32
---
5.3) Checked DB connections using a custom php script:
No issues connecting to DB with username/ password!
5.4) Enable Debugging/ Logging:
5.4.1) Logging in php.ini (/etc/php/7.4/apache2/php.ini)
----Turn on logging-----
display_errors = On
html_errors = On
display_startup_errors = On
log_errors = On
error_log = /var/log/apache2/php_errors.log
5.4.2) Logging in configuration.php (/var/www/html/<site_name>/configuration.php)
// Change to true from false
public $debug = true;
public $debug_lang = true;
// Change to 'maximum' from 'default'
public $error_reporting = 'maximum';
// Change to 1 from 0
public $log_everything = 1;
With those, J! info started showing up in the browser along with the error stack trace & queries.
5.5) Root cause analysis
5.5.1) Checked the specific php libraries:
libraries/src/Component/ComponentHelper.php:296
libraries/src/Application/SiteApplication.php:210, etc..
- Using var_dump($component), on SiteApplication.php:210 found:
$component = NULL
- The same variable has the value $component = "com_content" on the home page of a default Joomla installation (unzip the Joomla 4.3.4 zip, install & check the value on the Joomla home page).
- Test with hard coded $component = "com_content" in libraries/src/Application/SiteApplication.php:210
if(empty($component)){
$component = "com_content";
}
- With this, the 404 was gone & a broken site home page came up with a few Category links listed.
- Clicking on a Category link showed "No Article linked to Category", despite several Articles having been imported from the source db dump.
5.5.2) Localizing issue with Content/ Article loading:
- Hit the direct Article url:
http://<site_name>/index.php?option=com_content&view=article&id=<article_id>
- This gave another error, "404 Article not found", though the specific <article_id> was present in the database.
- J! info provided the corresponding php file and the db query used to fetch the article by id, which was returning no result.
5.5.3) Issue with imports of all "datetime DEFAULT NULL" fields
- On exploring the query further, it was seen to have checks for publish_up & publish_down dates. These needed to be either NULL or set to a date earlier (/later) than date NOW for publish_up (/publish_down).
- In the "#_content" table, publish_up & publish_down values were showing as "0000-00-00 00:00:00" (i.e. were imported as 0) in place of NULL. This was causing all records to be filtered out.
- It also meant that wherever the "datetime default NULL" fields were imported the same issue was happening.
- A check revealed 30 other J! tables with the same issue.
- Prepared a script to update each of these datetime fields to NULL in the 30 tables.
UPDATE `#_content` SET `checked_out_time` = NULL, `publish_up` = NULL, `publish_down` = NULL;
UPDATE `#_categories` SET `checked_out_time` = NULL;
.... for all the affected tables!
With that the issue was resolved & site home page became functional!
For anyone installing the plugin SexyPolling 4.1.7 on a Joomla 4.3.4 with a MySql 8.0 db on an Ubuntu system, there may be issues with default values for TEXT fields. More specifically, an error setting the default value for the country, city, etc. TEXT fields:
"BLOB, TEXT, GEOMETRY or JSON column 'country' can't have a default value"
1) There is a solution for MySql 8.0 to set DEFAULT values for TEXT fields. The CREATE statement for table `#_sexy_votes` needs to be changed to:
// Modify the sexypolling plugin CREATE TABLE script:
CREATE TABLE `#_sexy_votes` (
`id_vote` int(10) unsigned NOT NULL AUTO_INCREMENT,
....
`country` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`city` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`region` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
`countrycode` text NOT NULL DEFAULT (_utf8mb4'Unknown'),
PRIMARY KEY (`id_vote`),
.....
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;
Further, in order for the CREATE TABLE changes to take effect & not get dropped/ altered, a few of the plugin's installer scripts need to be modified. The installation then needs to be done from a folder (or a modified zip file) in which the modified plugin installer scripts are present (instead of the downloaded joomla_plugin_sexypolling_reloaded_v4.1.7.zip file), as explained next.
2) Installation of plugin from /tmp folder
2.1) Unzip the downloaded joomla_plugin_sexypolling_reloaded_v4.1.7.zip file to the site /tmp folder as mentioned in the site configuration.php file (e.g. /var/www/html/<site_name>/tmp). Give the folder proper read/write permissions.
2.2) Change CREATE TABLE `#_sexy_votes` command in /tmp/com_sexypolling/admin/install/sql/install.sql:
Set the DEFAULT value for country, city, region, countrycode to "DEFAULT (_utf8mb4'Unknown')" as mentioned above.
2.3) Remove the ALTER TABLE `#_sexy_votes` command from /tmp/com_sexypolling/scriptfile.php.
Put an invalid condition check on line 235 of the script to stop the ALTER TABLE for `#_sexy_votes` from running:
$alterSexyVotes=false;
if($alterSexyVotes && is_array($columns_titles)) {
...
2.4) Finally, install: